(54) Method and apparatus for encoding bitstreams for seamless reproduction



(57) An optical disk having such a data structure that
moving image data and audio data are naturally reproduced
under one title without stoppage (freeze), etc., of video
display at the connections of system streams (VOB) in which
the data are interleaved when the data are reproduced by
connecting the system streams (VOB) to each other. At least
the first audio frame (Af) contains the same audio data in a
plurality of branched system streams (VOB), and at least the
last GOP contains the same moving picture in a plurality of
system streams (VOB) before the connection.
Description

Technical Field 

[0001] The present invention relates to a method and
apparatus for system encoding bitstreams so that they can be
seamlessly connected and, more particularly, to bitstreams
for use in an authoring system for variously processing a
data bitstream comprising the video data, audio data, and
sub-picture data constituting each of plural program titles
containing related video data, audio data, and sub-picture
data content to generate a bitstream from which a new title
containing the content desired by the user can be
reproduced, and for efficiently recording and reproducing
said generated bitstream using a particular recording
medium.

Background Art 

[0002] Authoring systems used to produce program 
titles comprising related video data, audio data, and 
sub-picture data by digitally processing, for example, 
multimedia data comprising video, audio, and sub-pic- 
ture data recorded to laser disk or video CD formats are 
currently available. 

[0003] Systems using Video-CDs in particular are 
able to record video data to a CD format disk, which was 
originally designed with an approximately 600 MB re- 
cording capacity for storing digital audio data only, by 
using such high efficiency video compression tech- 
niques as MPEG. As a result of the increased effective 
recording capacity achieved using data compression 
techniques, karaoke titles and other conventional laser 
disk applications are gradually being transferred to the 
video CD format. 

[0004] Users today expect both sophisticated title 
content and high reproduction quality. To meet these ex- 
pectations, each title must be composed from bit- 
streams with an increasingly deep hierarchical struc- 
ture. The data size of multimedia titles written with bit- 
streams having such deep hierarchical structures, how- 
ever, is ten or more times greater than the data size of 
less complex titles. The need to edit small image (title) 
details also makes it necessary to process and control 
the bitstream using low order hierarchical data units. 
[0005] It is therefore necessary to develop and prove a
bitstream structure and an advanced digital processing
method including both recording and reproduction
capabilities whereby a large volume, multiple level
hierarchical digital bitstream can be efficiently controlled
at each level of the hierarchy. Also needed are an apparatus
for executing this digital processing method, and a
recording medium to which the bitstream digitally processed
by said apparatus can be efficiently recorded for storage
and from which said recorded information can be quickly
reproduced.

[0006] Means of increasing the storage capacity of
conventional optical disks have been widely researched to
address the recording medium aspect of this problem. One way
to increase the storage capacity of the optical disk is to
reduce the spot diameter D of the optical (laser) beam. If
the wavelength of the laser beam is λ and the aperture of
the objective lens is NA, then the spot diameter D is
proportional to λ/NA, and the storage capacity can be
efficiently improved by decreasing λ and increasing NA.
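
The proportionality of the spot diameter D to λ/NA can be illustrated with a short calculation. The following is a sketch only: the function name is hypothetical, the numeric values are illustrative, and the step from spot diameter to capacity (capacity taken as proportional to 1/D²) is an added assumption that the text does not state.

```python
def relative_capacity(wavelength_nm, na, ref_wavelength_nm, ref_na):
    """Capacity gain relative to a reference optical system.

    Uses the relation from the text (spot diameter D proportional to
    wavelength / NA) and, as an added assumption, takes the storage
    capacity to scale as 1 / D**2.
    """
    spot_ratio = (wavelength_nm / na) / (ref_wavelength_nm / ref_na)
    return 1.0 / spot_ratio ** 2


# Illustrative values only: a 650 nm, NA 0.60 system compared with a
# 780 nm, NA 0.45 reference system.
gain = relative_capacity(650.0, 0.60, 780.0, 0.45)
print(f"relative capacity: {gain:.2f}x")
```

Under these assumptions a shorter wavelength and a larger aperture both raise the capacity, in line with the statement above.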

[0007] As described, for example, in United States Patent
5,235,581, however, coma caused by a relative tilt between
the disk surface and the optical axis of the laser beam
(hereafter "tilt") increases when a large aperture (high NA)
lens is used. To prevent tilt-induced coma, the transparent
substrate must be made very thin. The problem is that the
mechanical strength of the disk is low when the transparent
substrate is very thin.
[0008] MPEG1, the conventional method of recording and
reproducing video, audio, and graphic signal data, has also
been replaced by the more robust MPEG2 method, which can
transfer large data volumes at a higher rate. It should be
noted that the compression method and data format of the
MPEG2 standard differ somewhat from those of MPEG1. The
specific content of and differences between MPEG1 and MPEG2
are described in detail in the ISO-11172 and ISO-13818 MPEG
standards, and further description thereof is omitted below.

[0009] Note, however, that while the structure of the
encoded video stream is defined in the MPEG2 specification,
the hierarchical structure of the system stream and the
method of processing lower hierarchical levels are not
defined.

[0010] As described above, it is therefore not possible in a
conventional authoring system to process a large data stream
containing sufficient information to satisfy many different
user requirements. Moreover, even if such a processing
method were available, the processed data recorded thereto
cannot be repeatedly used to reduce data redundancy because
there is no large capacity recording medium currently
available that can efficiently record and reproduce high
volume bitstreams such as described above.

[0011] More specifically, particular significant hardware
and software requirements must be satisfied in order to
process a bitstream using a data unit smaller than the
title. These specific hardware requirements include
significantly increasing the storage capacity of the
recording medium and increasing the speed of digital
processing; software requirements include inventing an
advanced digital processing method including a sophisticated
data structure.

[0012] Therefore, the object of the present invention is to
provide an effective authoring system for controlling a
multimedia data bitstream with advanced hardware and
software requirements using a data unit smaller than the
title to better address advanced user requirements.

[0013] To share data between plural titles and thereby
efficiently utilize optical disk capacity, multi-scene
control is desirable, whereby scene data common to plural
titles and the desired scenes on the same time-base from
within multi-scene periods containing plural scenes unique
to particular reproduction paths can be freely selected and
reproduced.

[0014] However, when plural scenes unique to a re- 
production path within the multi-scene period are ar- 
ranged on the same time-base, the scene data must be 
contiguous. Unselected multi-scene data is therefore 
unavoidably inserted between the selected common 
scene data and the selected multi-scene data. The prob- 
lem this creates when reproducing multi-scene data is 
that reproduction is interrupted by this unselected scene 
data. 

[0015] When one of the multiple scenes is connected
to common scene data, the difference between the vid- 
eo reproduction time and the audio reproduction time 
differs on each of the reproduction paths because of the 
offset between the audio and video frame reproduction 
times. As a result, the audio or video buffer underflows 
at the scene connection, causing video reproduction to 
stop ("freeze") or audio reproduction to stop ("mute"), 
and thus preventing seamless reproduction. It will also 
be obvious that the difference between the audio and 
video reproduction times can cause a buffer underflow 
state even when common scene data is connected 1:1. 
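
The offset between the audio and video reproduction times described above can be sketched numerically. The frame periods below are assumptions chosen for illustration (they are not taken from the text), and the function and variable names are hypothetical.

```python
from fractions import Fraction

# Assumed frame periods, for illustration only (not taken from the text).
VIDEO_FRAME = Fraction(1001, 30000)  # seconds per video frame
AUDIO_FRAME = Fraction(1536, 48000)  # seconds per audio frame


def av_gap(video_frames):
    """Video time left uncovered by whole audio frames at the end of a scene."""
    video_time = video_frames * VIDEO_FRAME
    whole_audio_frames = video_time // AUDIO_FRAME  # floor to whole frames
    return video_time - whole_audio_frames * AUDIO_FRAME


# Two hypothetical reproduction paths of different length.
gap_a = av_gap(300)
gap_b = av_gap(450)
print(float(gap_a), float(gap_b))
```

Because a scene rarely spans a whole number of both video and audio frames, the residual gap generally differs from one reproduction path to another, which is the condition the paragraph above identifies as the cause of the underflow.
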
[0016] Therefore, the object of the present invention 
is to provide a data structure whereby multi-scene data 
can be naturally reproduced as a single title without the 
video presentation stopping ("freezing") at one-to-one, 
one-to-many, or many-to-many scene connections in 
the system stream; a method for generating a system 
stream having said data structure; a recording appara- 
tus and a reproduction apparatus for recording and re- 
producing said system stream; and a medium to which 
said system stream can be recorded and from which 
said system stream can be reproduced by said record- 
ing apparatus and reproduction apparatus. 
[0017] The present application is based upon Japanese Patent
Application Nos. 7-252735 and 8-041581, which were filed on
September 29, 1995 and February 28, 1996, respectively, the
entire contents of which are expressly incorporated by
reference herein.

Disclosure of Invention 

[0018] The present invention has been developed with a view
to substantially solving the above described disadvantages
and has for its essential object to provide an optical disk
for recording more than one system stream containing audio
data and video data, wherein the audio data and video data
of the plural system streams recorded to the optical disk
are interleaved such that the difference between the input
start times of the video data and audio data to the video
buffer in the video decoder and the audio buffer in the
audio decoder is less than the reproduction time of the
number of audio frames that can be stored in the audio
buffer plus one audio frame.
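
The interleaving condition stated above can be written as a simple check. This is a minimal sketch under stated assumptions: the function name, parameter names, and numeric values are hypothetical, and all times are expressed in seconds.

```python
def input_start_offset_ok(video_input_start, audio_input_start,
                          audio_buffer_frames, audio_frame_time):
    """Check the interleaving condition stated above.

    The difference between the video and audio input start times must be
    less than the reproduction time of (audio_buffer_frames + 1) audio
    frames. All times are in seconds.
    """
    limit = (audio_buffer_frames + 1) * audio_frame_time
    return abs(video_input_start - audio_input_start) < limit


# Hypothetical numbers: an audio buffer holding 4 frames of 32 ms each.
ok = input_start_offset_ok(0.000, 0.100, 4, 0.032)
too_far = input_start_offset_ok(0.000, 0.200, 4, 0.032)
```

With these illustrative values the limit is five audio frames (160 ms), so an offset of 100 ms satisfies the condition and an offset of 200 ms does not.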

Brief Description of Drawings 

[0019] 

Fig. 1 is a graph schematically showing a structure of multi
media bit stream according to the present invention,
Fig. 2 is a block diagram showing an authoring encoder
according to the present invention,
Fig. 3 is a block diagram showing an authoring decoder
according to the present invention,
Fig. 4 is a side view of an optical disk storing the multi
media bit stream of Fig. 1,
Fig. 5 is an enlarged view showing a portion confined by a
circle of Fig. 4,
Fig. 6 is an enlarged view showing a portion confined by a
circle of Fig. 5,
Fig. 7 is a side view showing a variation of the optical
disk of Fig. 4,
Fig. 8 is a side view showing another variation of the
optical disk of Fig. 4,
Fig. 9 is a plan view showing one example of track path
formed on the recording surface of the optical disk of
Fig. 4,
Fig. 10 is a plan view showing another example of track path
formed on the recording surface of the optical disk of
Fig. 4,
Fig. 11 is a diagonal view schematically showing one example
of a track path pattern formed on the optical disk of
Fig. 7,
Fig. 12 is a plan view showing another example of track path
formed on the recording surface of the optical disk of
Fig. 7,
Fig. 13 is a diagonal view schematically showing one example
of a track path pattern formed on the optical disk of
Fig. 8,
Fig. 14 is a plan view showing another example of track path
formed on the recording surface of the optical disk of
Fig. 8,
Fig. 15 is a flow chart showing details of the decoder
synchronization process of Fig. 66,
Fig. 16 is a graph schematically showing the structure of
multimedia bit stream for use in Digital Video Disk system
according to the present invention,
Fig. 17 is a graph schematically showing the encoded video
stream according to the present invention,
Fig. 18 is a graph schematically showing an internal
structure of a video zone of Fig. 16,
Fig. 19 is a graph schematically showing the stream
management information according to the present invention,
Fig. 20 is a graph schematically showing the structure of
the navigation pack NV of Fig. 17,
Fig. 21 is a graph in assistance of explaining a concept of
parental lock playback control according to the present
invention,

Fig. 22 is a graph schematically showing the data structure
used in a digital video disk system according to the present
invention,
Fig. 23 is a graph in assistance of explaining a concept of
multi-angle scene control according to the present
invention,
Fig. 24 is a graph in assistance of explaining a concept of
multi scene data connection,
Fig. 25 is a block diagram showing a DVD encoder according
to the present invention,
Fig. 26 is a block diagram showing a DVD decoder according
to the present invention,
Fig. 27 is a graph schematically showing an encoding
information table generated by the encoding system
controller of Fig. 25,
Fig. 28 is a graph schematically showing encoding
information tables,
Fig. 29 is a graph schematically showing encoding parameters
used by the video encoder of Fig. 25,
Fig. 30 is a graph schematically showing an example of the
contents of the program chain information according to the
present invention,
Fig. 31 is a graph schematically showing another example of
the contents of the program chain information according to
the present invention,
Fig. 32 is a flow chart showing the encode parameters
generating operation for a system stream containing a single
scene,
Fig. 33 is a graph in assistance of explaining a concept of
multi-angle scene control according to the present
invention,
Fig. 34 is a flow chart, formed by Figs. 34A and 34B,
showing an operation of the DVD encoder of Fig. 25,
Fig. 35 is a flow chart showing details of the encode
parameter production sub-routine of Fig. 34,
Fig. 36 is a flow chart showing the details of the VOB data
setting routine of Fig. 35,
Fig. 37 is a flow chart showing the encode parameters
generating operation for a seamless switching,
Fig. 38 is a flow chart showing the encode parameters
generating operation for a system stream,
Fig. 39 is a graph showing simulated results of data
input/output to the video buffer and audio buffer of the DVD
decoder of Fig. 26,
Fig. 40 is a graph in assistance of explaining a concept of
parental control according to the present invention,
Fig. 41 is a graph in assistance of explaining the data
input/output to the video buffer of the DVD decoder DCD
shown in Fig. 26 during contiguous reproduction,
Fig. 42 is a graph in assistance of explaining a possible
problem under a parental lock control example shown in
Fig. 40,



Fig. 43 is a graph in assistance of explaining a
reproduction gap generated under parental lock control,
Fig. 44 is a graph showing system streams produced according
to the present invention,
Fig. 45 is a graph in assistance of explaining an operation
whereby these system streams are connected,
Fig. 46 is a graph in assistance of explaining a method of
generating system streams,
Fig. 47 is a graph in assistance of explaining another
method of producing a system stream,
Fig. 48 is a graph showing a structure of the end of the
second common system stream and the beginnings of the two
parental lock control system streams,
Fig. 49 is a graph in assistance of explaining the
difference in the video reproduction time and audio
reproduction time of different reproduction paths,
Fig. 50 is a block diagram showing an internal structure of
the system encoder in the DVD encoder of Fig. 25,
Fig. 51 is a graph showing a structure of the end of the two
parental lock control system streams and the beginning of
the following common system stream Sse,
Fig. 52 is a graph in assistance of explaining the
difference in the video reproduction time and audio
reproduction time of different reproduction paths,
Fig. 53 is a flow chart showing details of system stream
producing routine of Fig. 34,
Fig. 54 is a graph in assistance of explaining an operation
to calculate an audio data movement MFAp1,
Fig. 55 is a graph in assistance of explaining an operation
to calculate an audio data movement MFAp2,
Fig. 56 is a block diagram showing an internal structure of
the synchronizer of Fig. 26,
Fig. 57 is a flow chart showing an operation executed by the
audio decoder controller of Fig. 26,
Figs. 58 and 59 are graphs showing decoding information
table produced by the decoding system controller of Fig. 26,
Fig. 60 is a flow chart showing the operation of the DVD
decoder DCD of Fig. 26,
Fig. 61 is a flow chart showing details of reproduction
extracted PGC routing of Fig. 60,
Fig. 62 is a flow chart showing details of the stream buffer
data transfer process according to the present invention,
Fig. 63 is a flow chart showing details of the non
multi-angle decoding process of Fig. 62,
Fig. 64 is a flow chart showing details of the
non-multi-angled interleave process of Fig. 63,
Fig. 65 is a flow chart showing details of the
non-multi-angled contiguous block process,
Fig. 66 is a flow chart showing details of decoding data
process of Fig. 64, performed by the stream buffer,

Fig. 67 is a graph schematically showing an actual 
arrangement of data blocks recorded to a data re- 
cording track on a recording medium according to 
the present invention, 

Fig. 68 is a graph schematically showing contigu- 
ous block regions and interleaved block regions ar- 
ray, 

Fig. 69 is a graph schematically showing a content 
of a VTS title VOBS according to the present inven- 
tion, and 

Fig. 70 is a graph schematically showing an internal 
data structure of the interleaved block regions ac- 
cording to the present invention. 

Best Mode for Carrying Out the Invention 

[0020] The present invention is described in detail
with reference to the accompanying drawings.

Data structure of the authoring system 

[0021] The logic structure of the multimedia data bit- 
stream processed using the recording apparatus, re- 
cording medium, reproduction apparatus, and authoring 
system according to the present invention is described 
first below with reference to Fig. 1. 
[0022] In this structure, one title refers to the combi- 
nation of video and audio data expressing program con- 
tent recognized by a user for education, entertainment, 
or other purpose. Referenced to a motion picture (mov- 
ie), one title may correspond to the content of an entire 
movie, or to just one scene within said movie. 
[0023] A video title set (VTS) comprises the bitstream 
data containing the information for a specific number of 
titles. More specifically, each VTS comprises the video, 
audio, and other reproduction data representing the 
content of each title in the set, and control data for con- 
trolling the content data. 

[0024] The video zone VZ is the video data unit proc- 
essed by the authoring system, and comprises a spe- 
cific number of video title sets. More specifically, each 
video zone is a linear sequence of K + 1 video title sets 
numbered VTS #0 - VTS #K where K is an integer value 
of zero or greater. One video title set, preferably the first
video title set VTS #0, is used as the video manager 
describing the content information of the titles contained 
in each video title set. 

[0025] The multimedia bitstream MBS is the largest 
control unit of the multimedia data bitstream handled by 
the authoring system of the present invention, and comprises
plural video zones VZ.
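
The hierarchy described above (multimedia bitstream MBS, video zones VZ, video title sets VTS, titles) may be pictured as nested containers. The following is only a sketch: the class and field names are hypothetical and do not represent the recorded data format.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Title:
    name: str


@dataclass
class VideoTitleSet:          # VTS: content and control data for some titles
    titles: List[Title] = field(default_factory=list)


@dataclass
class VideoZone:              # VZ: linear sequence VTS #0 .. VTS #K
    title_sets: List[VideoTitleSet] = field(default_factory=list)

    @property
    def video_manager(self):
        # VTS #0 preferably serves as the video manager.
        return self.title_sets[0]


@dataclass
class MultimediaBitstream:    # MBS: largest control unit, plural video zones
    zones: List[VideoZone] = field(default_factory=list)


vz = VideoZone([VideoTitleSet(), VideoTitleSet([Title("movie")])])
mbs = MultimediaBitstream([vz])
```

Here `vz` holds K + 1 = 2 video title sets, the first of which plays the role of the video manager.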

Authoring encoder EC 

[0026] A preferred embodiment of the authoring encoder EC
according to the present invention for generating a new
multimedia bitstream MBS by re-encoding the original
multimedia bitstream MBS according to the scenario desired
by the user is shown in Fig. 2. Note that the original
multimedia bitstream MBS comprises a video stream St1
containing the video information, a sub-picture stream St3
containing caption text and other auxiliary video
information, and the audio stream St5 containing the audio
information.
[0027] The video and audio streams are the bitstreams
containing the video and audio information obtained from the
source within a particular period of time. The sub-picture
stream is a bitstream containing momentary video information
relevant to a particular scene. The sub-picture data encoded
to a single scene may be captured to video memory and
displayed continuously from the video memory for plural
scenes as may be necessary.

[0028] When this multimedia source data St1, St3, and St5 is
obtained from a live broadcast, the video and audio signals
are supplied in real-time from a video camera or other
imaging source; when the multimedia source data is
reproduced from a video tape or other recording medium, the
audio and video signals are not real-time signals.

[0029] While the multimedia source stream is shown in Fig. 2
as comprising these three source signals, this is for
convenience only, and it should be noted that the multimedia
source stream may contain more than three types of source
signals, and may contain source data for different titles.
Multimedia source data with audio, video, and sub-picture
data for plural titles are referred to below as multi-title
streams.
[0030] As shown in Fig. 2, the authoring encoder EC
comprises a scenario editor 100, encoding system controller
200, video encoder 300, video stream buffer 400, sub-picture
encoder 500, sub-picture stream buffer 600, audio encoder
700, audio stream buffer 800, system encoder 900, video zone
formatter 1300, recorder 1200, and recording medium M.

[0031] The video zone formatter 1300 comprises video object
(VOB) buffer 1000, formatter 1100, and volume and file
structure formatter 1400.
[0032] The bitstream encoded by the authoring encoder EC of
the present embodiment is recorded by way of example only to
an optical disk.

[0033] The scenario editor 100 of the authoring encoder EC
outputs the scenario data, i.e., the user-defined editing
instructions. The scenario data controls editing the
corresponding parts of the multimedia bitstream MBS
according to the user's manipulation of the video,
sub-picture, and audio components of the original multimedia
title. This scenario editor 100 preferably comprises a
display, speaker(s), keyboard, CPU, and source stream
buffer. The scenario editor 100 is connected to an external
multimedia bitstream source from which the multimedia source
data St1, St3, and St5 are supplied.

[0034] The user is thus able to reproduce the video and
audio components of the multimedia source data using the
display and speaker to confirm the content of the generated
title. The user is then able to edit the title content
according to the desired scenario using the keyboard, mouse,
and other command input devices while confirming the content
of the title on the display and speakers. The result of this
multimedia data manipulation is the scenario data St7.

[0035] The scenario data St7 is basically a set of
instructions describing what source data is selected from
all or a subset of the source data containing plural titles
within a defined time period, and how the selected source
data is reassembled to reproduce the scenario (sequence)
intended by the user. Based on the instructions received
through the keyboard or other control device, the CPU codes
the position, length, and the relative time-based positions
of the edited parts of the respective multimedia source data
streams St1, St3, and St5 to generate the scenario data St7.
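
One way to picture the scenario data St7 is as a list of records, each coding the position, length, and relative time-base position of an edited part of one source stream. The sketch below is illustrative only: the record layout, field names, and the `build_scenario` helper are hypothetical and are not the format actually produced by the scenario editor 100.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class EditSegment:
    stream: str        # "St1" (video), "St3" (sub-picture) or "St5" (audio)
    position: float    # start of the selected part in the source, seconds
    length: float      # duration of the selected part, seconds
    title_time: float  # relative time-base position in the edited title


def build_scenario(cuts) -> List[EditSegment]:
    """Lay the selected (position, length) cuts end to end for all streams."""
    scenario, title_time = [], 0.0
    for position, length in cuts:
        for stream in ("St1", "St3", "St5"):
            scenario.append(EditSegment(stream, position, length, title_time))
        title_time += length
    return scenario


# Two hypothetical cuts: 30 s taken from 120 s, then 10 s taken from 15 s.
st7 = build_scenario([(120.0, 30.0), (15.0, 10.0)])
```

In this sketch the second cut is placed at title time 30.0 s, immediately after the first, for each of the three streams.
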
[0036] The source stream buffer has a specific capacity, and
is used to delay the multimedia source data streams St1,
St3, and St5 a known time Td and then output streams St1,
St3, and St5.
[0037] This delay is required for synchronization with the
editor encoding process. More specifically, when data
encoding and user generation of scenario data St7 are
executed simultaneously, i.e., when encoding immediately
follows editing, time Td is required to determine the
content of the multimedia source data editing process based
on the scenario data St7 as will be described further below.
As a result, the multimedia source data must be delayed by
time Td to synchronize the editing process during the actual
encoding operation. Because this delay time Td is limited to
the time required to synchronize the operation of the
various system components in the case of sequential editing
as described above, the source stream buffer is normally
achieved by means of a high speed storage medium such as
semiconductor memory.
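
The role of the source stream buffer as a fixed delay of time Td can be sketched as a first-in, first-out queue. This is a minimal illustration under stated assumptions: the class name, method names, and time values are hypothetical, and a practical buffer would hold stream data rather than labelled strings.

```python
from collections import deque


class SourceStreamBuffer:
    """FIFO that releases each item a fixed delay Td after it was input."""

    def __init__(self, delay_td):
        self.delay_td = delay_td
        self._queue = deque()  # (input_time, item), oldest first

    def push(self, input_time, item):
        self._queue.append((input_time, item))

    def pop_ready(self, now):
        """Return every item whose delay Td has elapsed at time `now`."""
        ready = []
        while self._queue and self._queue[0][0] + self.delay_td <= now:
            ready.append(self._queue.popleft()[1])
        return ready


buf = SourceStreamBuffer(delay_td=2.0)
buf.push(0.0, "frame0")
buf.push(1.0, "frame1")
early = buf.pop_ready(now=1.5)   # nothing has waited Td yet
later = buf.pop_ready(now=2.5)   # only frame0 has waited Td
```

The required capacity grows with Td, which is consistent with a short delay being served by fast memory.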

[0038] During batch editing in which all multimedia source
data is encoded at once ("batch encoded") after scenario
data St7 is generated for the complete title, delay time Td
must be long enough to process the complete title or longer.
In this case, the source stream buffer may be a low speed,
high capacity storage medium such as video tape, magnetic
disk, or optical disk.
[0039] The structure (type) of media used for the source
stream buffer may therefore be determined according to the
delay time Td required and the allowable manufacturing cost.
[0040] The encoding system controller 200 is connected to
the scenario editor 100 and receives the scenario data St7
therefrom. Based on the time-base position and length
information of the edit segment contained in the scenario
data St7, the encoding system controller 200 generates the
encoding parameter signals St9, St11, and St13 for encoding
the edit segment of the multimedia source data. The encoding
signals St9, St11, and St13 supply the parameters used for
video, sub-picture, and audio encoding, including the
encoding start and end timing. Note that multimedia source
data St1, St3, and St5 are output after delay time Td by the
source stream buffer, and are therefore synchronized to
encoding parameter signals St9, St11, and St13.

[0041] More specifically, encoding parameter signal 
St9 is the video encoding signal specifying the encoding 
timing of video stream St1 to extract the encoding seg- 
ment from the video stream St1 and generate the video 
encoding unit. Encoding parameter signal St11 is like- 
wise the sub-picture stream encoding signal used to 
generate the sub-picture encoding unit by specifying the 
encoding timing for sub-picture stream St3. Encoding 
parameter signal St13 is the audio encoding signal used
to generate the audio encoding unit by specifying the 
encoding timing for audio stream St5. 
[0042] Based on the time-base relationship between the
encoding segments of streams St1, St3, and St5 in the
multimedia source data contained in scenario data St7, the
encoding system controller 200 generates the timing signals
St21, St23, and St25 for arranging the encoded multimedia
stream in the specified time-base relationship.

[0043] The encoding system controller 200 also gen- 
erates the reproduction time information IT defining the 
reproduction time of the title editing unit (video object, 
VOB), and the stream encoding data St33 defining the 
system encode parameters for multiplexing the encod- 
ed multimedia stream containing video, audio, and sub- 
picture data. Note that the reproduction time information 
IT and stream encoding data St33 are generated for the 
video object VOB of each title in one video zone VZ. 
[0044] The encoding system controller 200 also gen- 
erates the title sequence control signal St39, which de- 
clares the formatting parameters for formatting the title 
editing units VOB of each of the streams in a particular 
time-base relationship as a multimedia bitstream. More 
specifically, the title sequence control signal St39 is 
used to control the connections between the title editing 
units (VOB) of each title in the multimedia bitstream 
MBS, or to control the sequence of the interleaved title
editing units (VOBs), in which the title editing units
VOB of plural reproduction paths are interleaved.
[0045] The video encoder 300 is connected to the 
source stream buffer of the scenario editor 100 and to 
the encoding system controller 200, and receives there- 
from the video stream St1 and video encoding parame- 
ter signal St9, respectively. Encoding parameters sup- 
plied by the video encoding signal St9 include the en- 
coding start and end timing, bit rate, the encoding con- 
ditions for the encoding start and end, and the material 
type. Possible material types include NTSC or PAL vid- 
eo signal, and telecine converted material. Based on the 
video encoding parameter signal St9, the video encoder 
300 encodes a specific part of the video stream St1 to 
generate the encoded video stream St15. 
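
The parameters carried by the video encoding parameter signal St9 can be pictured as a small record. The sketch below is illustrative only: the class and field names are hypothetical, the validation rules are assumptions, and the encoding conditions for the encoding start and end are omitted for brevity.

```python
from dataclasses import dataclass
from enum import Enum


class MaterialType(Enum):
    NTSC = "ntsc"
    PAL = "pal"
    TELECINE = "telecine"


@dataclass(frozen=True)
class VideoEncodeParams:      # hypothetical container for the content of St9
    start: float              # encoding start timing, seconds
    end: float                # encoding end timing, seconds
    bit_rate: int             # bits per second
    material: MaterialType

    def __post_init__(self):
        if self.end <= self.start:
            raise ValueError("encoding end must follow encoding start")
        if self.bit_rate <= 0:
            raise ValueError("bit rate must be positive")

    @property
    def duration(self):
        return self.end - self.start


st9 = VideoEncodeParams(start=120.0, end=150.0, bit_rate=4_000_000,
                        material=MaterialType.NTSC)
```

Here `st9.duration` gives the length of the specific part of the video stream St1 to be encoded.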
[0046] The sub-picture encoder 500 is similarly connected to
the source stream buffer of the scenario editor 100 and to
the encoding system controller 200, and receives therefrom
the sub-picture stream St3 and sub-picture encoding
parameter signal St11, respectively. Based on the
sub-picture encoding parameter signal St11, the sub-picture
encoder 500 encodes a specific part of the sub-picture
stream St3 to generate the encoded sub-picture stream St17.
[0047] The audio encoder 700 is also connected to 
the source stream buffer of the scenario editor 100 and 
to the encoding system controller 200, and receives 
therefrom the audio stream St5 and audio encoding pa- 
rameter signal St13, which supplies the encoding start 
and end timing. Based on the audio encoding parameter 
signal St13, the audio encoder 700 encodes a specific 
part of the audio stream St5 to generate the encoded 
audio stream St19. 

[0048] The video stream buffer 400 is connected to 
the video encoder 300 and to the encoding system con- 
troller 200. The video stream buffer 400 stores the en- 
coded video stream St15 input from the video encoder 
300, and outputs the stored encoded video stream St15
as the time-delayed encoded video stream St27 based 
on the timing signal St21 supplied from the encoding 
system controller 200. 

[0049] The sub-picture stream buffer 600 is similarly 
connected to the sub-picture encoder 500 and to the en- 
coding system controller 200. The sub-picture stream 
buffer 600 stores the encoded sub-picture stream St17 
output from the sub-picture encoder 500, and then out- 
puts the stored encoded sub-picture stream St17 as 
time-delayed encoded sub-picture stream St29 based 
on the timing signal St23 supplied from the encoding 
system controller 200. 

[0050] The audio stream buffer 800 is similarly con- 
nected to the audio encoder 700 and to the encoding 
system controller 200. The audio stream buffer 800 
stores the encoded audio stream St19 input from the 
audio encoder 700, and then outputs the encoded audio
stream St19 as the time-delayed encoded audio stream
St31 based on the timing signal St25 supplied from the 
encoding system controller 200. 
[0051] The system encoder 900 is connected to the 
video stream buffer 400, sub-picture stream buffer 600, 
audio stream buffer 800, and the encoding system con- 
troller 200, and is respectively supplied thereby with the 
time-delayed encoded video stream St27, time-delayed 
encoded sub-picture stream St29, time-delayed encod- 
ed audio stream St31, and the stream encoding data 
St33. Note that the system encoder 900 is a multiplexer 
that multiplexes the time-delayed streams St27, St29, 
and St31 based on the stream encoding data St33 (tim- 
ing signal) to generate title editing unit (VOB) St35. The 
stream encoding data St33 contains the system encod- 
ing parameters, including the encoding start and end 
timing. 
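
The multiplexing performed by the system encoder 900 can be sketched as a timestamp-ordered merge of the three time-delayed streams. This is a deliberate simplification: the packet layout `(timestamp, kind, payload)` and the function name are hypothetical, and the sketch does not model the decoder buffer constraints that an actual system encoder would also have to respect.

```python
import heapq


def multiplex(video, sub_picture, audio):
    """Interleave three time-ordered packet lists into one stream.

    Each packet is a (timestamp, kind, payload) tuple, and each input list
    is assumed to be sorted by timestamp already.
    """
    return list(heapq.merge(video, sub_picture, audio, key=lambda p: p[0]))


# Hypothetical packets standing in for streams St27, St29, and St31.
st27 = [(0.00, "V", b"v0"), (0.04, "V", b"v1")]
st29 = [(0.01, "S", b"s0")]
st31 = [(0.00, "A", b"a0"), (0.03, "A", b"a1")]
vob = multiplex(st27, st29, st31)
order = [kind for _, kind, _ in vob]
```

The merged list plays the role of the title editing unit (VOB) St35; packets with equal timestamps keep the order in which their streams were passed to `multiplex`.
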

[0052] The video zone formatter 1300 is connected to
the system encoder 900 and the encoding system con- 
troller 200 from which the title editing unit (VOB) St35 
and title sequence control signal St39 (timing signal) are 
respectively supplied. The title sequence control signal 
St39 contains the formatting start and end timing, and 
the formatting parameters used to generate (format) a 
multimedia bitstream MBS. The video zone formatter 
1300 rearranges the title editing units (VOB) St35 in one 
video zone VZ in the scenario sequence defined by the 
user based on the title sequence control signal St39 to 
generate the edited multimedia stream data St43. 
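
The rearrangement performed by the video zone formatter 1300 can be pictured, under simplifying assumptions, as ordering already-encoded units by a scenario list. In the Python sketch below, `arrange_vobs`, the VOB names, and the byte payloads are all hypothetical; the real title sequence control signal St39 also carries formatting parameters that are not modeled here.

```python
from typing import Dict, List


def arrange_vobs(vobs: Dict[str, bytes], scenario: List[str]) -> List[bytes]:
    """Return the VOB payloads in the order the scenario names them."""
    missing = [name for name in scenario if name not in vobs]
    if missing:
        raise KeyError(f"scenario refers to unknown VOBs: {missing}")
    return [vobs[name] for name in scenario]


# Hypothetical VOB names and payloads, for illustration only.
encoded = {"intro": b"\x01", "scene_a": b"\x02", "scene_b": b"\x03"}
video_zone = arrange_vobs(encoded, ["intro", "scene_b", "scene_a"])
print(video_zone)
```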
[0053] The multimedia bitstream MBS St43 edited ac- 
cording to the user-defined scenario is then sent to the 
recorder 1200. The recorder 1200 processes the edited 
multimedia stream data St43 to the data stream St45 
format of the recording medium M, and thus records the 
formatted data stream St45 to the recording medium M. 
Note that the multimedia bitstream MBS recorded to the 
recording medium M contains the volume file structure 
VFS, which includes the physical address of the data on 
the recording medium generated by the video zone for- 
matter 1300. 

[0054] Note that the encoded multimedia bitstream 
MBS St35 may be output directly to the decoder to im- 
mediately reproduce the edited title content. It will be 
obvious that the output multimedia bitstream MBS will 
not in this case contain the volume file structure VFS. 

Authoring decoder DC 

[0055] A preferred embodiment of the authoring de- 
coder DC used to decode the multimedia bitstream MBS 
edited by the authoring encoder EC of the present in- 
vention, and thereby reproduce the content of each title 
unit according to the user-defined scenario, is described 
next below with reference to Fig. 3. Note that in the pre- 
ferred embodiment described below the multimedia bit- 
stream St45 encoded by the authoring encoder EC is 
recorded to the recording medium M. 
[0056] As shown in Fig. 3, the authoring decoder DC 
comprises a multimedia bitstream producer 2000, sce- 
nario selector 2100, decoding system controller 2300, 
stream buffer 2400, system decoder 2500, video buffer 
2600, sub-picture buffer 2700, audio buffer 2800,
synchronizer 2900, video decoder 3800, sub-picture
decoder 3100, audio decoder 3200, synthesizer 3500, video
data output terminal 3600, and audio data output termi- 
nal 3700. 

[0057] The bitstream producer 2000 comprises a re- 
cording media drive unit 2004 for driving the recording 
medium M; a reading head 2006 for reading the infor- 
mation recorded to the recording medium M and pro- 
ducing the binary read signal St57; a signal processor 
2008 for variously processing the read signal St57 to 
generate the reproduced bitstream St61; and a repro- 
duction controller 2002. 

[0058] The reproduction controller 2002 is connected 
to the decoding system controller 2300 from which the 
multimedia bitstream reproduction control signal St53 is
supplied, and in turn generates the reproduction control 
signals St55 and St59 respectively controlling the re- 
cording media drive unit (motor) 2004 and signal proc- 
essor 2008. 

[0059] So that the user-defined video, sub-picture, 
and audio portions of the multimedia title edited by the 
authoring encoder EC are reproduced, the authoring
decoder DC comprises a scenario selector 2100 for
selecting and reproducing the corresponding scenes (titles).
The scenario selector 2100 then outputs the selected 
titles as scenario data to the authoring decoder DC. 
[0060] The scenario selector 2100 preferably com- 
prises a keyboard, CPU, and monitor. Using the key- 
board, the user then inputs the desired scenario based 
on the content of the scenario input by the authoring en- 
coder EC. Based on the keyboard input, the CPU gen- 
erates the scenario selection data St51 specifying the 
selected scenario. The scenario selector 2100 is con- 
nected by an infrared communications device, for ex- 
ample, to the decoding system controller 2300, to which 
it inputs the scenario selection data St51. 
[0061] Based on the scenario selection data St51, the
decoding system controller 2300 then generates the bit- 
stream reproduction control signal St53 controlling the 
operation of the bitstream producer 2000. 
[0062] The stream buffer 2400 has a specific buffer 
capacity used to temporarily store the reproduced bit- 
stream St61 input from the bitstream producer 2000, ex- 
tract the address information and initial synchronization 
data SCR (system clock reference) for each stream, and 
generate bitstream control data St63. The stream buffer 
2400 is also connected to the decoding system control- 
ler 2300, to which it supplies the generated bitstream 
control data St63. 

[0063] The synchronizer 2900 is connected to the de- 
coding system controller 2300 from which it receives the 
system clock reference SCR contained in the synchro- 
nization control data St81 to set the internal system 
clock STC and supply the reset system clock St79 to the 
decoding system controller 2300. 
[0064] Based on this system clock St79, the decoding 
system controller 2300 also generates the stream read 
signal St65 at a specific interval and outputs the read 
signal St65 to the stream buffer 2400. 
[0065] Based on the supplied read signal St65, the 
stream buffer 2400 outputs the reproduced bitstream 
St61 at a specific interval to the system decoder 2500 
as bitstream St67. 

[0066] Based on the scenario selection data St51, the
decoding system controller 2300 generates the decoding
signal St69 defining the stream IDs for the video,
sub-picture, and audio bitstreams corresponding to the
selected scenario, and outputs it to the system decoder
2500.

[0067] Based on the instructions contained in the
decoding signal St69, the system decoder 2500
respectively outputs the video, sub-picture, and audio
bitstreams input from the stream buffer 2400 to the video
buffer 2600, sub-picture buffer 2700, and audio buffer
2800 as the encoded video stream St71, encoded
sub-picture stream St73, and encoded audio stream St75.
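
The routing performed by the system decoder 2500 can be sketched as a demultiplexer keyed on stream ID. This is a minimal Python illustration under stated assumptions: the numeric IDs, the `(stream_id, payload)` packet layout, and the `demultiplex` function are hypothetical, and dropping unselected streams is one plausible reading of selecting only the streams named in the decoding signal St69.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Packet = Tuple[int, bytes]  # (stream_id, payload); hypothetical layout


def demultiplex(packets: Iterable[Packet],
                routing: Dict[int, str]) -> Dict[str, List[bytes]]:
    """Route each packet to the buffer named for its stream ID.

    Packets whose ID is not in `routing` (streams not selected by the
    scenario) are dropped.
    """
    buffers: Dict[str, List[bytes]] = defaultdict(list)
    for stream_id, payload in packets:
        buffer_name = routing.get(stream_id)
        if buffer_name is not None:
            buffers[buffer_name].append(payload)
    return dict(buffers)


# Hypothetical stream IDs standing in for the content of St69.
routing = {1: "video", 2: "sub-picture", 3: "audio"}
st67 = [(1, b"v0"), (3, b"a0"), (4, b"a-other"), (2, b"s0"), (1, b"v1")]
buffers = demultiplex(st67, routing)
print(buffers)
```

Here the packet with ID 4 stands for an audio stream that the selected scenario does not use, so it never reaches the audio buffer.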
[0068] The system decoder 2500 detects the presentation
time stamp PTS and decoding time stamp DTS of the
smallest control unit in each bitstream St67 to generate
the time information signal St77. This time information
signal St77 is supplied to the synchronizer 2900 through
the decoding system controller 2300 as the
synchronization control data St81.
[0069] Based on this synchronization control data
St81, the synchronizer 2900 determines the decoding
start timing whereby each of the bitstreams will be
arranged in the correct sequence after decoding, and then
generates and inputs the video stream decoding start
signal St89 to the video decoder 3800 based on this
decoding timing. The synchronizer 2900 also generates
and supplies the sub-picture decoding start signal St91
and audio stream decoding start signal St93 to the
sub-picture decoder 3100 and audio decoder 3200,
respectively.

[0070] The video decoder 3800 generates the video
output request signal St84 based on the video stream
decoding start signal St89, and outputs it to the video
buffer 2600. In response to the video output request
signal St84, the video buffer 2600 outputs the video
stream St83 to the video decoder 3800. The video decoder
3800 thus detects the presentation time information
contained in the video stream St83, and disables the
video output request signal St84 when the length of the
received video stream St83 is equivalent to the specified
presentation time. A video stream equal in length to the
specified presentation time is thus decoded by the video
decoder 3800, which outputs the reproduced video signal
St104 to the synthesizer 3500.
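
The request-and-stop behavior described for the video decoder can be sketched as a loop that pulls units from a buffer until their combined duration reaches the specified presentation time. The Python below is an assumption-laden illustration: the `(label, duration)` unit layout and the name `read_for_presentation` are hypothetical, and real streams carry presentation time information in a format not described in this passage.

```python
from collections import deque
from typing import Deque, List, Tuple

Unit = Tuple[str, int]  # (label, duration in ticks); hypothetical layout


def read_for_presentation(buffer: Deque[Unit],
                          presentation_time: int) -> List[str]:
    """Pull units from the buffer until their total duration reaches the
    requested presentation time, then stop requesting output."""
    received: List[str] = []
    elapsed = 0
    while buffer and elapsed < presentation_time:
        label, duration = buffer.popleft()
        received.append(label)
        elapsed += duration
    return received


video_buffer = deque([("f0", 3), ("f1", 3), ("f2", 3), ("f3", 3)])
chunk = read_for_presentation(video_buffer, presentation_time=6)
print(chunk, len(video_buffer))
```

The loop condition plays the part of the output request signal: while it holds, the buffer keeps supplying data, and once the received length matches the presentation time the request is dropped and the remaining units stay in the buffer.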
[0071] The sub-picture decoder 3100 similarly generates
the sub-picture output request signal St86 based on the
sub-picture decoding start signal St91, and outputs it to
the sub-picture buffer 2700. In response to the
sub-picture output request signal St86, the sub-picture
buffer 2700 outputs the sub-picture stream St85 to the
sub-picture decoder 3100. Based on the presentation
time information contained in the sub-picture stream
St85, the sub-picture decoder 3100 decodes a length of
the sub-picture stream St85 corresponding to the
specified presentation time to reproduce and supply to the
synthesizer 3500 the sub-picture signal St99.
[0072] The synthesizer 3500 superimposes the video
signal St104 and sub-picture signal St99 to generate
and output the multi-picture video signal St105 to the
video data output terminal 3600.
[0073] The audio decoder 3200 generates and supplies
to the audio buffer 2800 the audio output request
signal St88 based on the audio stream decoding start
signal St93. The audio buffer 2800 thus outputs the
audio stream St87 to the audio decoder 3200. The audio
decoder 3200 decodes a length of the audio stream
St87 corresponding to the specified presentation time
based on the presentation time information contained in
the audio stream St87, and outputs the decoded audio
stream St101 to the audio data output terminal 3700.
[0074] It is thus possible to reproduce a user-defined 
multimedia bitstream MBS in real-time according to a 
user-defined scenario. More specifically, each time the 
user selects a different scenario, the authoring decoder 
DC is able to reproduce the title content desired by the 
user in the desired sequence by reproducing the multi- 
media bitstream MBS corresponding to the selected 
scenario. 

[0075] It is therefore possible by means of the author- 
ing system of the present invention to generate a multi- 
media bitstream according to plural user-defined sce- 
narios by real-time or batch encoding multimedia source 
data in a manner whereby the substreams of the small- 
est editing units (scenes), which can be divided into plu- 
ral substreams, expressing the basic title content are ar- 
ranged in a specific time-base relationship. 
[0076] The multimedia bitstream thus encoded can 
then be reproduced according to the one scenario se- 
lected from among plural possible scenarios. It is also 
possible to change scenarios while playback is in 
progress, i.e., to select a different scenario and dynam- 
ically generate a new multimedia bitstream according to 
the most recently selected scenario. It is also possible 
to dynamically select and reproduce any of plural 
scenes while reproducing the title content according to 
a desired scenario. 

[0077] It is therefore possible by means of the author- 
ing system of the present invention to encode and not 
only reproduce but to repeatedly reproduce a multime- 
dia bitstream MBS in real-time. 
[0078] Details of the authoring system are disclosed in a
Japanese Patent Application filed September 27, 1996,
and entitled and assigned to the same assignee as the
present application.

DVD 



[0079] An example of a digital video disk (DVD) with 
only one recording surface (a single-sided DVD) is 
shown in Fig. 4. 

[0080] The DVD recording medium RC1 in the pre- 
ferred embodiment of the invention comprises a data re- 
cording surface RS1 to and from which data is written 
and read by emitting laser beam LS, and a protective 
layer PL1 covering the data recording surface RS1. A 
backing layer BL1 is also provided on the back of data 
recording surface RS1. The side of the disk on which 
protective layer PL1 is provided is therefore referred to 
below as side SA (commonly "side A"), and the opposite 
side (on which the backing layer BL1 is provided) is re- 
ferred to as side SB ("side B"). Note that digital video 
disk recording media having a single data recording sur- 
face RS1 on only one side such as this DVD recording 
medium RC1 are commonly called single-sided,
single-layer disks.

[0081] A detailed illustration of area C1 in Fig. 4 is
shown in Fig. 5. Note that the data recording surface
RS1 is formed by applying a metallic thin film or other
reflective coating as a data layer 4109 on a first
transparent layer 4108 having a particular thickness T1.
This first transparent layer 4108 also functions as the
protective layer PL1. A second transparent substrate 4111
of a thickness T2 functions as the backing layer BL1, and
is bonded to the first transparent layer 4108 by means
of an adhesive layer 4110 disposed therebetween.
[0082] A printing layer 4112 for printing a disk label
may also be disposed on the second transparent
substrate 4111 as necessary. The printing layer 4112 does
not usually cover the entire surface area of the second
transparent substrate 4111 (backing layer BL1), but only
the area needed to print the text and graphics of the disk
label. The area of second transparent substrate 4111 on
which the printing layer 4112 is not formed may be left
exposed. Light reflected from the data layer 4109
(metallic thin film) forming the data recording surface RS1
can therefore be directly observed where the label is not
printed when the digital video disk is viewed from side
SB. As a result, when the metallic thin film is an aluminum
thin film, for example, the background appears
silver-white and the printed text and graphics appear to
float over it.
[0083] Note that it is only necessary to provide the
printing layer 4112 where needed for printing, and it is
not necessary to provide the printing layer 4112 over the
entire surface of the backing layer BL1.

[0084] A detailed illustration of area C2 in Fig. 5 is
shown in Fig. 6. Pits and lands are molded to the
common contact surface between the first transparent layer
4108 and the data layer 4109 on side SA from which
data is read by emitting a laser beam LS, and data is
recorded by varying the lengths of the pits and lands
(i.e., the length of the intervals between the pits). More
specifically, the pit and land configuration formed on the
first transparent layer 4108 is transferred to the data
layer 4109. The lengths of the pits and lands are shorter,
and the pitch of the data tracks formed by the pit
sequences is narrower, than with a conventional Compact
Disc (CD). The surface recording density is therefore
greatly improved.

[0085] Side SA of the first transparent layer 4108 on
which data pits are not formed is a flat surface. The
second transparent substrate 4111 is for reinforcement, and
is a transparent panel made from the same material as
the first transparent layer 4108 with both sides flat.
Thicknesses T1 and T2 are preferably equal and
commonly approximately 0.6 mm, but the invention shall not
be so limited.

[0086] As with a CD, information is read by irradiating
the surface with a laser beam LS and detecting the
change in the reflectivity of the light spot. Because the
numerical aperture NA of the objective lens can be large
and the wavelength λ of the light beam small in a digital
video disk system, the diameter of the light spot Ls used
can be reduced to approximately 1/1.6 the light spot
needed to read a CD. Note that this means the resolution
of the laser beam LS in the DVD system is approximately
1.6 times the resolution of a conventional CD system.
[0087] The optical system used to read data from the
digital video disk uses a short 650 nm wavelength red
semiconductor laser and an objective lens with a
numerical aperture NA of 0.6. By thus also reducing the
thickness T of the transparent panels to 0.6 mm, more than
5 GB of data can be stored to one side of a 120 mm
diameter optical disk.
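
The 1/1.6 spot-size figure can be checked with a short calculation, on the assumption that the focused spot diameter scales as wavelength divided by numerical aperture. In the Python sketch below, the DVD values (650 nm, and 0.6 taken as the numerical aperture of the objective lens) come from the text; the CD values (780 nm, NA 0.45) are typical figures assumed here and are not stated in this document.

```python
def relative_spot_diameter(wavelength_nm: float,
                           numerical_aperture: float) -> float:
    """Quantity proportional to the focused spot diameter (lambda / NA)."""
    return wavelength_nm / numerical_aperture


dvd = relative_spot_diameter(650.0, 0.6)    # values given in the text
cd = relative_spot_diameter(780.0, 0.45)    # assumed typical CD optics
ratio = dvd / cd
print(f"DVD spot / CD spot = {ratio:.3f} (1/1.6 = {1 / 1.6:.3f})")
```

Under these assumptions the ratio agrees with the approximately 1/1.6 reduction cited above.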

[0088] It is therefore possible to store motion picture
(video) images having an extremely large per unit data
size to a digital video disk without losing image quality
because the storage capacity of a single-sided,
single-layer recording medium RC1 with one data
recording surface RS1 as thus described is nearly ten
times the storage capacity of a conventional CD. As a
result, while the video presentation time of a
conventional CD system is approximately 74 minutes if image
quality is sacrificed, high quality video images with a
video presentation time exceeding two hours can be
recorded to a DVD.
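
A rough sense of why the larger capacity preserves image quality comes from the average bit rate each medium allows. The Python sketch below uses the 5 GB and two-hour figures from the text and the approximately 600 MB CD capacity cited earlier in this document; treating 1 GB as 10^9 bytes and devoting the whole capacity to a single stream are simplifying assumptions.

```python
def average_bitrate_mbps(capacity_bytes: float, minutes: float) -> float:
    """Average bit rate, in Mbit/s, that fills the capacity in the given time."""
    return capacity_bytes * 8 / (minutes * 60) / 1e6


dvd_rate = average_bitrate_mbps(5e9, 120)    # 5 GB over two hours
cd_rate = average_bitrate_mbps(600e6, 74)    # about 600 MB over 74 minutes
print(f"DVD: {dvd_rate:.2f} Mbit/s, CD: {cd_rate:.2f} Mbit/s")
```

On these assumptions the DVD sustains an average of roughly 5.6 Mbit/s against roughly 1.1 Mbit/s for the CD, about five times as many bits per second of video.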

[0089] The digital video disk is therefore well-suited
as a recording medium for video images.
[0090] A digital video disk recording medium with
plural recording surfaces RS as described above is shown
in Figs. 7 and 8. The DVD recording medium RC2 shown
in Fig. 7 comprises two recording surfaces, i.e., first
recording surface RS1 and semi-transparent second
recording surface RS2, on the same side, i.e., side SA, of
the disk. Data can be simultaneously recorded or
reproduced from these two recording surfaces by using
different laser beams LS1 and LS2 for the first recording
surface RS1 and the second recording surface RS2. It
is also possible to read/write both recording surfaces
RS1 and RS2 using only one of the laser beams LS1 or
LS2. Note that recording media thus comprised are
called "single-sided, dual-layer disks."
[0091] It should also be noted that while two recording
surfaces RS1 and RS2 are provided in this example, it
is also possible to produce digital video disk recording
media having more than two recording surfaces RS.
Disks thus comprised are known as "single-sided,
multi-layer disks."
[0092] Though comprising two recording surfaces
similarly to the recording media shown in Fig. 7, the DVD
recording medium RC3 shown in Fig. 8 has the
recording surfaces on opposite sides of the disk, i.e., has
the first data recording surface RS1 on side SA and the
second data recording surface RS2 on side SB. It will also
be obvious that while only two recording surfaces are
shown on one digital video disk in this example, more
than two recording surfaces may also be formed on a
double-sided digital video disk. As with the recording
medium shown in Fig. 7, it is also possible to provide
two separate laser beams LS1 and LS2 for recording
surfaces RS1 and RS2, or to read/write both recording
surfaces RS1 and RS2 using a single laser beam. Note
that this type of digital video disk is called a
"double-sided, dual-layer disk." It will also be obvious
that a double-sided digital video disk can be comprised
with two or more recording surfaces per side. This type
of disk is called a "double-sided, multi-layer disk."
[0093] A plan view from the laser beam LS irradiation 
side of the recording surface RS of the DVD recording 
medium RC is shown in Fig. 9 and Fig. 10. Note that a 
continuous spiral data recording track TR is provided 
from the inside circumference to the outside circumfer- 
ence of the DVD. The data recording track TR is divided 
into plural sectors each having the same known storage 
capacity. Note that for simplicity only the data recording 
track TR is shown in Fig. 9 with more than three sectors 
per revolution. 

[0094] As shown in Fig. 9, the data recording track TR 
is normally formed clockwise inside to outside (see ar- 
row DrA) from the inside end point IA at the inside cir- 
cumference of disk RCA to the outside end point OA at 
the outside circumference of the disk with the disk RCA 
rotating counterclockwise RdA. This type of disk RCA is 
called a clockwise disk, and the recording track formed 
thereon is called a clockwise track TRA. 
[0095] Depending upon the application, the recording 
track TRB may be formed clockwise from outside to in- 
side circumference (see arrow DrB in Fig. 10) from the 
outside end point OB at the outside circumference of 
disk RCB to the inside end point IB at the inside circum- 
ference of the disk with the disk RCB rotating clockwise 
RdB. Because the recording track appears to wind 
counterclockwise when viewed from the inside circum- 
ference to the outside circumference on disks with the 
recording track formed in the direction of arrow DrB, 
these disks are referred to as counterclockwise disk 
RCB with counterclockwise track TRB to distinguish 
them from disk RCA in Fig. 9. Note that track directions 
DrA and DrB are the track paths along which the laser 
beam travels when scanning the tracks for recording 
and playback. Direction of disk rotation RdA in which 
disk RCA turns is thus opposite the direction of track 
path DrA, and direction of disk rotation RdB in which disk 
RCB turns is thus opposite the direction of track path 
DrB. 

[0096] An exploded view of the single-sided,
dual-layer disk RC2 shown in Fig. 7 is shown as disk RC2o
in Fig. 11. Note that the recording tracks formed on the
two recording surfaces run in opposite directions.
Specifically, a clockwise recording track TRA as shown in
Fig. 9 is formed in clockwise direction DrA on the (lower)
first data recording surface RS1, and a counterclockwise
recording track TRB formed in counterclockwise direction
DrB as shown in Fig. 10 is provided on the (upper)
second data recording surface RS2. As a result, the outside
end points OA and OB of the first and second (top and
bottom) tracks are at the same radial position relative to
the center axis of the disk RC2o. Note that track paths
DrA and DrB of tracks TR are also the data read/write
directions to disk RC. The first and second (top and
bottom) recording tracks thus wind opposite each other with
this disk RC, i.e., the track paths DrA and DrB of the top
and bottom recording layers are opposite track paths.
[0097] Opposite track path type, single-sided,
dual-layer disks RC2o rotate in direction RdA
corresponding to the first recording surface RS1 with the
laser beam LS traveling along track path DrA to trace the
recording track on the first recording surface RS1. When
the laser
beam LS reaches the outside end point OA, the laser 
beam LS can be refocused to end point OB on the sec- 
ond recording surface RS2 to continue tracing the re- 
cording track from the first to the second recording sur- 
face uninterrupted. The physical distance between the 
recording tracks TRA and TRB on the first and second 
recording surfaces RS1 and RS2 can thus be instanta- 
neously eliminated by simply adjusting the focus of the 
laser beam LS. 

[0098] It is therefore possible with an opposite track
path type, single-sided, dual-layer disk RC2o to easily
process the recording tracks disposed on physically
discrete top and bottom recording surfaces as a single
continuous recording track. It is therefore also possible
in an authoring system as described above with reference
to Fig. 1 to continuously record the multimedia bitstream
MBS that is the largest multimedia data management
unit to two discrete recording surfaces RS1 and RS2 on
a single recording medium RC2o.
[0099] It should be noted that the tracks on recording 
surfaces RS1 and RS2 can be wound in the directions 
opposite those described above, i.e., the counterclock- 
wise track TRB may be provided on the first recording 
surface RS1 and the clockwise track TRA on the second 
recording surface RS2. In this case the direction of disk 
rotation is also changed to a clockwise rotation RdB, 
thereby enabling the two recording surfaces to be used 
as comprising a single continuous recording track as de- 
scribed above. For simplification, a further example of 
this type of disk is therefore neither shown nor described
below. 

[0100] It is therefore possible by thus constructing the 
digital video disk to record the multimedia bitstream 
MBS for a feature-length title to a single opposite track 
path type, single-sided, dual-layer disk RC2o. Note that 
this type of digital video disk medium is called a single- 
sided dual-layer disk with opposite track paths. 
[0101] Another example of the single-sided, dual-lay- 
er DVD recording medium RC2 shown in Fig. 7 is shown 
as disk RC2p in Fig. 12. The recording tracks formed on 
both first and second recording surfaces RS1 and RS2 
are clockwise tracks TRA as shown in Fig. 9. In this 
case, the single-sided, dual-layer disk RC2p rotates 
counterclockwise in the direction of arrow RdA, and the 
direction of laser beam LS travel is the same as the di- 
rection of the track spiral, i.e., the track paths of the top 
and bottom recording surfaces are mutually parallel 
(parallel track paths). The outside end points OA of both 
top and bottom tracks are again preferably positioned
at the same radial position relative to the center axis of
the disk RC2p as described above. As also described 
above with disk RC2o shown in Fig. 11, the access point 
can be instantaneously shifted from outside end point 
OA of track TRA on the first recording surface RS1 to 
the outside end point OA of track TRA on the second 
recording surface RS2 by appropriately adjusting the fo- 
cus of the laser beam LS at outside end point OA. 
[0102] However, for the laser beam LS to continuously
access the clockwise recording track TRA on the second
recording surface RS2, the recording medium
RC2p must be driven in the opposite direction (clock- 
wise, opposite direction RdA). Depending on the radial 
position of the laser beam LS, however, it is inefficient 
to change the rotational direction of the recording medi- 
um. As shown by the diagonal arrow in Fig. 12, the laser 
beam LS is therefore moved from the outside end point 
OA of the track on the first recording surface RS1 to the 
inside end point IA of the track on the second recording 
surface RS2 to use these physically discrete recording 
tracks as one logically continuous recording track. 
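
The difference between the opposite and parallel track path layouts described above can be made concrete with a small address-mapping sketch. The Python below is illustrative only: the `locate` function is hypothetical, and it assumes that both layers hold the same number of sectors and that a sector's index along the track can stand in for its radial position, neither of which is stated in the text.

```python
from typing import Tuple


def locate(logical: int, sectors_per_layer: int,
           opposite: bool) -> Tuple[int, int]:
    """Map a logical sector number to (layer, radial index).

    Radial index 0 is the inside end of a layer and
    sectors_per_layer - 1 is the outside end.
    """
    if not 0 <= logical < 2 * sectors_per_layer:
        raise ValueError("logical sector out of range")
    if logical < sectors_per_layer:
        return 0, logical                          # first layer: inside to outside
    offset = logical - sectors_per_layer
    if opposite:
        return 1, sectors_per_layer - 1 - offset   # second layer: outside to inside
    return 1, offset                               # second layer: inside to outside


n = 1000
for opposite in (True, False):
    _, last = locate(n - 1, n, opposite)
    _, first = locate(n, n, opposite)
    label = "opposite" if opposite else "parallel"
    print(label, "radial jump at layer change:", abs(first - last))
```

With opposite track paths the layer change needs only a refocus, because the last sector of the first layer and the first sector of the second layer share a radial position. With parallel track paths the beam must also travel from the outside end back to the inside end, which is the diagonal move shown in Fig. 12.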
[0103] Rather than using the recording tracks on top 
and bottom recording surfaces as one continuous re- 
cording track, it is also possible to use the recording 
tracks to record the multimedia bitstreams MBS for dif- 
ferent titles. This type of digital video disk recording me- 
dium is called a "single-sided, dual-layer disk with par- 
allel track paths." 

[0104] Note that if the direction of the tracks formed 
on the recording surfaces RS1 and RS2 is opposite that 
described above, i.e., counterclockwise recording 
tracks TRB are formed, disk operation remains the 
same as that described above except for the direction 
of disk rotation, which is clockwise as shown by arrow 
RdB. 

[0105] Whether using clockwise or counterclockwise 
recording tracks, the single-sided, dual-layer disk RC2p 
with parallel track paths thus described is well-suited to 
storing on a single disk encyclopedia and similar multi- 
media bitstreams comprising multiple titles that are fre- 
quently and randomly accessed. 
[0106] An exploded view of the dual-sided single-lay- 
er DVD recording medium RC3 comprising one record- 
ing surface layer RS1 and RS2 on each side as shown 
in Fig. 8 is shown as DVD recording medium RC3s in 
Fig. 13. Clockwise recording track TRA is provided on 
the one recording surface RS1, and a counterclockwise
recording track TRB is provided on the other recording 
surface RS2. As in the preceding recording media, the 
outside end points OA and OB of the recording tracks 
on each recording surface are preferably positioned at 
the same radial position relative to the center axis of the 
DVD recording medium RC3s. 
[0107] Note that while the recording tracks on these 
recording surfaces RS1 and RS2 rotate in opposite di- 
rections, the track paths are symmetrical. This type of 
recording medium is therefore known as a double-sided 
dual layer disk with symmetrical track paths. This
double-sided dual layer disk with symmetrical track paths
RC3s rotates in direction RdA when reading/writing the 
first recording surface RS1. As a result, the track path 
on the second recording surface RS2 on the opposite 
side is opposite the direction DrB in which the track 
winds, i.e., direction DrA. Accessing both recording sur- 
faces RS1 and RS2 using a single laser beam LS is 
therefore not realistic irrespective of whether access is 
continuous or non-continuous. In addition, a multimedia 
bitstream MBS is separately recorded to the recording 
surfaces on the first and second sides of the disk. 
[0108] A different example of the double-sided single
layer disk RC3 shown in Fig. 8 is shown in Fig. 14 as 
disk RC3a. Note that this disk comprises clockwise re- 
cording tracks TRA as shown in Fig. 9 on both recording 
surfaces RS1 and RS2. As with the preceding recording 
media, the outside end points OA and OA of the record- 
ing tracks on each recording surface are preferably po- 
sitioned at the same radial position relative to the center 
axis of the DVD recording medium RC3a. Unlike the 
double-sided dual layer disk with symmetrical track 
paths RC3s described above, the tracks on these re- 
cording surfaces RS1 and RS2 are asymmetrical. This 
type of disk is therefore known as a double-sided dual 
layer disk with asymmetrical track paths. This double- 
sided dual layer disk with asymmetrical track paths 
RC3a rotates in direction RdA when reading/writing the 
first recording surface RS1. As a result, the track path
on the second recording surface RS2 on the opposite 
side is opposite the direction DrA in which the track 
winds, i.e., direction DrB. 

[0109] This means that if a laser beam LS is driven 
continuously from the inside circumference to the out- 
side circumference on the first recording surface RS1, 
and then from the outside circumference to the inside 
circumference on the second recording surface RS2, 
both sides of the recording medium RC3a can be read/ 
written without turning the disk over and without provid- 
ing different laser beams for the two sides. 
[0110] The track paths for recording surfaces RS1 
and RS2 are also the same with this double-sided dual 
layer disk with asymmetrical track paths RC3a. As a re- 
sult, it is also possible to read/write both sides of the disk 
without providing separate laser beams for each side if 
the recording medium RC3a is turned over between 
sides, and the read/write apparatus can therefore be 
constructed economically. 

[0111] It should be noted that this recording medium 
remains functionally identical even if counterclockwise 
recording track TRB is provided in place of clockwise 
recording track TRA on both recording surfaces RS1 
and RS2. 

[0112] As described above, the true value of a DVD 
system whereby the storage capacity of the recording 
medium can be easily increased by using a multiple lay- 
er recording surface is realized in multimedia applica- 
tions whereby plural video data units, plural audio data 
units, and plural graphics data units recorded to a single
disk are reproduced through interactive operation by the
user. 

[0113] It is therefore possible to achieve one long- 
standing desire of software (programming) providers, 
specifically, to provide programming content such as a
commercial movie on a single recording medium in plu- 
ral versions for different language and demographic 
groups while retaining the image quality of the original. 

Parental control

[0114] Content providers of movie and video titles
have conventionally had to produce, supply, and manage
the inventory of individual titles in multiple languages,
typically the language of each distribution market,
and multi-rated title packages conforming to the
parental control (censorship) regulations of individual
countries in Europe and North America. The time and
resources required for this are significant. While high
image quality is obviously important, the programming
content must also be consistently reproducible.
[0115] The digital video disk recording medium is 
close to solving these problems. 

Multiple angles

[0116] One interactive operation widely sought in multimedia
applications today is for the user to be able to
change the position from which a scene is viewed during
reproduction of that scene. This capability is achieved
by means of the multiple angle function.
[0117] This multiple angle function makes possible
applications whereby, for example, a user can watch a
baseball game from different angles (or virtual positions
in the stadium), and can freely switch between the views
while viewing is in progress. In this example of a baseball
game, the available angles may include a position
behind the backstop centered on the catcher, batter, and
pitcher; one from behind the backstop centered on a
fielder, the pitcher, and the catcher; and one from center
field showing the view to the pitcher and catcher.
[0118] To meet these requirements, the digital video
disk system uses MPEG, the same basic standard format
used with Video-CDs to record the video, audio,
graphics, and other signal data. Because of the differences
in storage capacity, transfer rates, and signal
processing performance within the reproduction apparatus,
DVD uses MPEG2, the compression method and
data format of which differ slightly from the MPEG1 format
used with Video-CDs.

[0119] It should be noted that the content of and differences
between the MPEG1 and MPEG2 standards
have no direct relationship to the intent of the present
invention, and further description is therefore omitted
below (for more information, see MPEG specifications
ISO-11172 and ISO-13818).

[0120] The data structure of the DVD system accord- 
ing to the present invention is described in detail below 
with reference to Figs. 16, 17, 18, 19, 20, and 21.

Multi-scene control

[0121] A fully functional and practical parental lock 
playback function and multi-angle scene playback func- 
tion must enable the user to modify the system output 
in minor, subtle ways while still presenting substantially 
the same video and audio output. If these functions are 
achieved by preparing and recording separate titles sat- 
isfying each of the many possible parental lock and mul- 
ti-angle scene playback requests, titles that are sub- 
stantially identical and differ in only minor ways must be 
recorded to the recording medium. This results in iden- 
tical data being repeatedly recorded to the larger part of 
the recording medium, and significantly reduces the uti- 
lization efficiency of the available storage capacity. More 
particularly, it is virtually impossible to record discrete 
titles satisfying every possible request even using the 
massive capacity of the digital video disk medium. While 
it may be concluded that this problem can be easily 
solved by increasing the capacity of the recording me- 
dium, this is an obviously undesirable solution when the 
effective use of available system resources is consid- 
ered. 

[0122] Using multi-scene control, the concept of
which is described in another section below, in a DVD 
system, it is possible to dynamically construct titles for 
numerous variations of the same basic content using the 
smallest possible amount of data, and thereby effective- 
ly utilize the available system resources (recording me- 
dium). More specifically, titles that can be played back 
with numerous variations are constructed from basic 
(common) scene periods containing data common to 
each title, and multi-scene periods comprising groups 
of different scenes corresponding to the various re- 
quests. During reproduction, the user is able to freely 
and at any time select particular scenes from the multi- 
scene periods to dynamically construct a title conform- 
ing to the desired content, e.g., a title omitting certain 
scenes using the parental lock control function. 
[0123] Note that multi-scene control enabling a paren- 
tal lock playback control function and multi-angle scene 
playback is described in another section below with ref- 
erence to Fig. 21. 

Data structure of the DVD system 

[0124] The data structure used in the authoring sys- 
tem of a digital video disk system according to the 
present invention is shown in Fig. 22. To record a mul- 
timedia bitstream MBS, this digital video disk system di- 
vides the recording medium into three major recording 
areas, the lead-in area LI, the volume space VS, and 
the lead-out area LO. 

[0125] The lead-in area LI is provided at the inside
circumference area of the optical disk. In the disks
described with reference to Figs. 9 and 10, the lead-in area
LI is positioned at the inside end points IA and IB of each
track. Data for stabilizing the operation of the reproducing
apparatus when reading starts is written to the lead-in
area LI.

[0126] The lead-out area LO is correspondingly located
at the outside circumference of the optical disk, i.e.,
at outside end points OA and OB of each track in the
disks described with reference to Figs. 9 and 10. Data
identifying the end of the volume space VS is recorded
in this lead-out area LO.

[0127] The volume space VS is located between the
lead-in area LI and lead-out area LO, and is recorded
as a one-dimensional array of n+1 (where n is an integer
greater than or equal to zero) 2048-byte logic sectors
LS. The logic sectors LS are sequentially numbered #0,
#1, #2, ... #n. The volume space VS is also divided into
a volume and file structure management area VFS and
a file data structure area FDS.
[0128] The volume and file structure management area
VFS comprises m+1 logic sectors LS#0 to LS#m
(where m is an integer greater than or equal to zero and
less than n). The file data structure area FDS comprises n-m
logic sectors LS#m+1 to LS#n.
[0129] Note that this file data structure area FDS corresponds
to the multimedia bitstream MBS shown in Fig.
1 and described above.
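
The sector arithmetic described above can be sketched as follows. This is only an illustration of the stated counts (m+1 sectors for the VFS area, n-m sectors for the FDS area, 2048 bytes per logic sector); the function name `volume_layout` and the returned dictionary keys are assumptions made for this sketch and do not appear in the specification.

```python
SECTOR_SIZE = 2048  # bytes per logic sector LS, as stated in the text


def volume_layout(n: int, m: int) -> dict:
    """Return sector counts and byte offsets for the VFS and FDS areas.

    n: number of the last logic sector in the volume space (LS#n)
    m: number of the last logic sector of the VFS area (LS#m), 0 <= m < n
    """
    if not 0 <= m < n:
        raise ValueError("require 0 <= m < n")
    vfs = range(0, m + 1)      # LS#0 .. LS#m   -> m + 1 sectors
    fds = range(m + 1, n + 1)  # LS#m+1 .. LS#n -> n - m sectors
    return {
        "vfs_sectors": len(vfs),
        "fds_sectors": len(fds),
        "fds_byte_offset": fds.start * SECTOR_SIZE,  # hypothetical helper value
        "volume_bytes": (n + 1) * SECTOR_SIZE,
    }
```

For example, `volume_layout(n=9, m=3)` describes a ten-sector volume space whose first four sectors hold the VFS area and whose remaining six hold the FDS area.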

[0130] The volume file structure VFS is the file system
for managing the data stored to the volume space VS
as files, and is divided into logic sectors LS#0 - LS#m
where m is the number of sectors required to store all
data needed to manage the entire disk, and is a natural
number less than n. Information for the files stored to
the file data structure area FDS is written to the volume
file structure VFS according to a known specification
such as ISO-9660 or ISO-13346.

[0131] The file data structure area FDS comprises n-m
logic sectors LS#m+1 - LS#n, and contains a video
manager VMG sized to an integer multiple of the logic
sector (2048 x I, where I is a known integer), and k video
title sets VTS #1 - VTS #k (where k is a natural number
less than 100).

[0132] The video manager VMG stores the title management
information for the entire disk, and information
for building a volume menu used to set and change
reproduction control of the entire volume.

[0133] Any video title set VTS #k is also called a "vid- 
eo file" representing a title comprising video, audio, and/ 
or still image data. 

[0134] The internal structure of each video title set
VTS shown in Fig. 22 is shown in Fig. 16. Each video
title set VTS comprises VTS information VTSI describing
the management information for the entire disk, and
the VTS title video objects VOB (VTSTT_VOBS), i.e.,
the system stream of the multimedia bitstream. The VTS
information VTSI is described first below, followed by the
VTS title VOBS.

[0135] The VTS information primarily includes the VTSI
management table VTSI_MAT and VTSPGC information
table VTS_PGCIT.

[0136] The VTSI management table VTSI_MAT
stores such information as the internal structure of the
video title set VTS, the number of selectable audio
streams contained in the video title set VTS, the number
of sub-pictures, and the video title set VTS location
(storage address).

[0137] The VTSPGC information table VTS_PGCIT
records i (where i is a natural number) program chain
(PGC) data blocks VTS_PGCI #1 - VTS_PGCI #i for
controlling the playback sequence. Each of the table
entries VTS_PGCI #i is a data entry expressing the program
chain, and comprises j (where j is a natural
number) cell playback information blocks C_PBI #1 -
C_PBI #j. Each cell playback information block C_PBI
#j contains the playback sequence of the cell and playback
control information.

[0138] The program chain PGC is a conceptual structure
describing the story of the title content, and therefore
defines the structure of each title by describing the
cell playback sequence. Note that these cells are described
in detail below.

[0139] If, for example, the video title set information
relates to the menus, the video title set information VTSI
is stored to a buffer in the playback device when playback
starts. If the user then presses a MENU button on
a remote control device, for example, during playback,
the playback device references the buffer to fetch the
menu information and display the top menu #1. If the
menus are hierarchical, the main menu stored as program
chain information VTS_PGCI #1 may be displayed,
for example, by pressing the MENU button,
VTS_PGCI #2 - #9 may correspond to submenus accessed
using the numeric keypad on the remote control,
and VTS_PGCI #10 and higher may correspond to additional
submenus further down the hierarchy. Alternatively,
VTS_PGCI #1 may be the top menu displayed by
pressing the MENU button, while VTS_PGCI #2 and
higher may be voice guidance reproduced by pressing
the corresponding numeric key.
[0140] The menus themselves are expressed by the
plural program chains defined in this table. As a result,
the menus may be freely constructed in various ways,
and shall not be limited to hierarchical or non-hierarchical
menus or menus containing voice guidance.
[0141] In the case of a movie, for example, the video
title set information VTSI is stored to a buffer in the playback
device when playback starts; the playback device then
references the cell playback sequence described by the
program chain PGC, and reproduces the system
stream.

[0142] The "cells" referenced here may be all or part
of the system stream, and are used as access points
during playback. Cells can therefore be used, for example,
as the "chapters" into which a title may be divided.
[0143] Note that each of the PGC information entries
C_PBI #j contains both cell playback processing information
and a cell information table. The cell playback
processing information comprises the processing information
needed to reproduce the cell, such as the presentation
time and number of repetitions. More specifically,
this information includes the cell block mode CBM,
cell block type CBT, seamless playback flag SPF, inter- 
leaved allocation flag IAF, STC resetting flag STCDF, 
cell presentation time C_PBTM, seamless angle 
change flag SACF, first cell VOBU start address 
C_FVOBU_SA, and the last cell VOBU start address 
C_LVOBU_SA. 

[0144] Note that seamless playback refers to the re- 
production in a digital video disk system of multimedia 
data including video, audio, and sub-picture data with- 
out intermittent breaks in the data or information. Seam- 
less playback is described in detail in another section 
below with reference to Fig. 23 and Fig. 24. 
[0145] The cell block mode CBM indicates whether 
plural cells constitute one functional block. The cell play- 
back information of each cell in a functional block is ar- 
ranged consecutively in the PGC information. The cell 
block mode CBM of the first cell playback information in 
this sequence contains the value of the first cell in the 
block, and the cell block mode CBM of the last cell play- 
back information in this sequence contains the value of 
the last cell in the block. The cell block mode CBM of 
each cell arrayed between these first and last cells con- 
tains a value indicating that the cell is a cell between 
these first and last cells in that block. 
[0146] The cell block type CBT identifies the type of 
the block indicated by the cell block mode CBM. For ex- 
ample, when a multiple angle function is enabled, the 
cell information corresponding to each of the reproduc- 
ible angles is programmed as one of the functional 
blocks mentioned above, and the type of these function- 
al blocks is defined by a value identifying "angle" in the 
cell block type CBT for each cell in that block. 
[0147] The seamless playback flag SPF simply indi- 
cates whether the corresponding cell is to be linked and 
played back seamlessly with the cell or cell block repro- 
duced immediately therebefore. To seamlessly repro- 
duce a given cell with the preceding cell or cell block, 
the seamless playback flag SPF is set to 1 in the cell 
playback information for that cell; otherwise SPF is set
to 0.

[0148] The interleaved allocation flag IAF stores a val- 
ue identifying whether the cell exists in a contiguous or 
interleaved block. If the cell is part of an interleaved 
block, the flag IAF is set to 1; otherwise it is set to 0.
[0149] The STC resetting flag STCDF identifies 
whether the system time clock STC used for synchroni- 
zation must be reset when the cell is played back; when 
resetting the system time clock STC is necessary, the 
STC resetting flag STCDF is set to 1.
[0150] The seamless angle change flag SACF stores
a value indicating whether a cell in a multi-angle period
should be connected seamlessly at an angle change. If
the angle change is seamless, the seamless angle
change flag SACF is set to 1; otherwise it is set to 0.
[0151] The cell presentation time C_PBTM expresses
the cell presentation time with video frame precision. 
[0152] The first cell VOBU start address 
C_FVOBU_SA is the VOBU start address of the first cell 
in a block, and is also expressed as the distance from 
the logic sector of the first cell in the VTS title VOBS 
(VTSTT_VOBS) as measured by the number of sectors. 
[0153] The last cell VOBU start address 
C_LVOBU_SA is the VOBU start address of the last cell 
in the block. The value of this address is expressed as 
the distance from the logic sector of the first cell in the 
VTS title VOBS (VTSTT_VOBS) as measured by the 
number of sectors. 
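
The cell playback information fields described above can be pictured as a small record, and the cell block mode CBM as a marker used to regroup consecutive cells into functional blocks. The following is a minimal sketch under stated assumptions: the text gives no numeric encodings, so the `CBM` enum members, the field types, and the helper `group_cells` are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class CBM(Enum):
    # Symbolic stand-ins only; the actual stored values are not given in the text.
    NOT_IN_BLOCK = "none"
    FIRST = "first"
    MIDDLE = "middle"
    LAST = "last"


@dataclass
class CellPlaybackInfo:
    cbm: CBM             # cell block mode
    cbt: str = ""        # cell block type, e.g. "angle"
    spf: int = 0         # seamless playback flag
    iaf: int = 0         # interleaved allocation flag
    stcdf: int = 0       # STC resetting flag
    sacf: int = 0        # seamless angle change flag
    c_pbtm: int = 0      # cell presentation time (video frame precision)
    c_fvobu_sa: int = 0  # first cell VOBU start address, in sectors
    c_lvobu_sa: int = 0  # last cell VOBU start address, in sectors


def group_cells(cells: List[CellPlaybackInfo]) -> List[List[CellPlaybackInfo]]:
    """Group consecutively arranged cells into functional blocks using CBM."""
    groups: List[List[CellPlaybackInfo]] = []
    current = None  # block currently being collected, if any
    for cell in cells:
        if cell.cbm in (CBM.NOT_IN_BLOCK, CBM.FIRST):
            if current is not None:
                raise ValueError("previous block has no last cell")
            if cell.cbm is CBM.FIRST:
                current = [cell]
            else:
                groups.append([cell])  # a stand-alone cell
        else:
            if current is None:
                raise ValueError("block has no first cell")
            current.append(cell)
            if cell.cbm is CBM.LAST:
                groups.append(current)
                current = None
    if current is not None:
        raise ValueError("unterminated block")
    return groups
```

A sequence consisting of one stand-alone cell, a three-cell angle block, and another stand-alone cell would be returned as three groups of sizes 1, 3, and 1.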

[0154] The VTS title VOBS (VTSTT_VOBS), i.e., the 
multimedia system stream data, is described next. The 
system stream data VTSTT_VOBS comprises i (where 
i is a natural number) system streams SS, each of which 
is referred to as a "video object" (VOB). Each video ob- 
ject VOB #1 - VOB #i comprises at least one video data 
block interleaved with up to a maximum of eight audio data
blocks and up to a maximum of 32 sub-picture data blocks.
[0155] Each video object VOB comprises q (where q 
is a natural number) cells C#1 - C#q. Each cell C com- 
prises r (where r is a natural number) video object units 
VOBU #1 - VOBU #r. 

[0156] Each video object unit VOBU comprises plural
groups_of_pictures GOP, and the audio and sub-pic- 
tures corresponding to the playback of said plural 
groups_of_pictures GOP. Note that the 
group_of_pictures GOP corresponds to the video en- 
coding refresh cycle. Each video object unit VOBU also 
starts with an NV pack, i.e., the control data for that VO- 
BU. 

[0157] The structure of the navigation packs NV is de- 
scribed with reference to Fig. 18. 
[0158] Before describing the navigation pack NV, the
internal structure of the video zone VZ (see Fig. 22), i.e.,
the system stream St35 encoded by the authoring
encoder EC described with reference to Fig. 25, is described
with reference to Fig. 17. Note that the encoded
video stream St15 shown in Fig. 17 is the compressed
one-dimensional video data stream encoded by the video
encoder 300. The encoded audio stream St19 is likewise
the compressed one-dimensional audio data
stream multiplexing the right and left stereo audio chan- 
nels encoded by the audio encoder 700. Note that the 
audio signal shall not be limited to a stereo signal, and 
may also be a multichannel surround-sound signal. 
[0159] The system stream (title editing unit VOB) St35 
is a one-dimensional array of packs with a byte size corresponding
to the logic sectors LS #n having a
2048-byte capacity as described using Fig. 21. A stream
control pack is placed at the beginning of the title editing 
unit (VOB) St35, i.e., at the beginning of the video object 
unit VOBU. This stream control pack is called the "nav- 
igation pack NV", and records the data arrangement in 
the system stream and other control information. 
[0160] The encoded video stream St15 and the encoded
audio stream St19 are packetized in byte units
corresponding to the system stream packs. These packets
are shown in Fig. 17 as packets V1, V2, V3, V4...
and A1, A2, A3.... As shown in Fig. 17, these packets
are interleaved in the appropriate sequence as system
stream St35, thus forming a packet stream, with consideration
given to the decoder buffer size and the time required
by the decoder to expand the video and audio
data packets. In the example shown in Fig. 17, the packet
stream is interleaved in the sequence V1, V2, A1, V3,
V4, A2....

[0161] Note that the sequence shown in Fig. 17 interleaves
one video data unit with one audio data unit. Significantly
increased recording/playback capacity, high
speed recording/playback, and performance improvements
in the signal processing LSI enable the DVD system
to record plural audio data and plural sub-picture
data (graphics data) to one video data unit in a single
interleaved MPEG system stream, and thereby enable
the user to select the specific audio data and sub-picture
data to be reproduced during playback. The structure of
the system stream used in this type of DVD system is
shown in Fig. 18 and described below.
[0162] As in Fig. 17, the packetized encoded video
stream St15 is shown in Fig. 18 as V1, V2, V3, V4, ... In
this example, however, there is not just one encoded
audio stream St19, but three encoded audio streams
St19A, St19B, and St19C input as the source data.
There are also two encoded sub-picture streams St17A
and St17B input as the source data sub-picture streams.
These six compressed data streams, St15, St19A,
St19B, St19C, St17A and St17B, are interleaved to a
single system stream St35.

[0163] The video data is encoded according to the
MPEG specification with the group_of_pictures GOP
being the unit of compression. In general, each
group_of_pictures GOP contains 15 frames in the case
of an NTSC signal, but the specific number of frames
compressed to one GOP is variable. The stream management
pack, which describes the management data
containing, for example, the relationship between interleaved
data, is also interleaved at the GOP unit interval.
Because the group_of_pictures GOP unit is based on
the video data, changing the number of video frames
per GOP unit changes the interval of the stream management
packs. This interval is expressed in terms of
the presentation time on the digital video disk within a
range from 0.4 sec. to 1.0 sec. referenced to the GOP
unit. If the presentation time of contiguous plural GOP
units is less than 1 sec., the management data packs
for the video data of the plural GOP units are interleaved
to a single stream.

[0164] These management data packs are referred to
as navigation packs NV in the digital video disk system.
The data from one navigation pack NV to the packet immediately
preceding the next navigation pack NV forms
one video object unit VOBU. In general, one contiguous
playback unit that can be defined as one scene is called
a video object VOB, and each video object VOB contains
plural video object units VOBU. Data sets of plural
video objects VOB form a VOB set (VOBS). Note that
these data units were first used in the digital video disk.
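
The rule that a video object unit VOBU runs from one navigation pack NV up to the pack immediately preceding the next navigation pack can be sketched as a simple grouping pass. The string labels ("NV", "V1", "A1", ...) and the function name `split_into_vobus` are illustrative assumptions; a real stream would identify packs from their headers rather than from labels.

```python
from typing import List


def split_into_vobus(packs: List[str]) -> List[List[str]]:
    """Group a pack sequence into VOBUs, each starting at a navigation pack."""
    vobus: List[List[str]] = []
    for pack in packs:
        if pack == "NV":
            vobus.append([pack])  # a navigation pack opens a new VOBU
        elif not vobus:
            raise ValueError("stream must start with a navigation pack NV")
        else:
            vobus[-1].append(pack)  # all other packs join the current VOBU
    return vobus
```

Applied to the sequence NV, V1, V2, A1, NV, V3, A2, this yields two units: one holding NV, V1, V2, A1 and one holding NV, V3, A2.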
[0165] When a plurality of these data streams are interleaved,
the navigation packs NV defining the relationship
between the interleaved packs must also be interleaved
at a defined unit known as the pack number unit.
Each group_of_pictures GOP is normally a unit containing
approximately 0.5 sec. of video data, which is equivalent
to the presentation time required for 12-15
frames, and one navigation pack NV is generally interleaved
with the number of data packets required for this
presentation time.

[0166] The stream management information contained
in the interleaved video, audio, and sub-picture
data packets constituting the system stream is described
below with reference to Fig. 19. As shown in Fig.
19, the data contained in the system stream is recorded
in a format packed or packetized according to the
MPEG2 standard. The packet structure is essentially
the same for video, audio, and sub-picture data. One
pack in the digital video disk system has a 2048 byte
capacity as described above, and contains a pack header
PKH and one packet PES; each packet PES contains
a packet header PTH and data block.
[0167] The pack header PKH records the time at
which that pack is to be sent from stream buffer 2400 to
system decoder 2500 (see Fig. 26), i.e., the system
clock reference SCR defining the reference time for synchronized
audio-visual data playback. The MPEG
standard assumes that the system clock reference SCR
is the reference clock for the entire decoder operation.
With such disk media as the digital video disk, however,
time management specific to individual disk players can
be used, and a reference clock for the decoder system
is therefore separately provided.
[0168] The packet header PTH similarly contains a
presentation time stamp PTS and a decoding time
stamp DTS, both of which are placed in the packet before
the access unit (the decoding unit). The presentation
time stamp PTS defines the time at which the video
data or audio data contained in the packet should be
output as the playback output after being decoded, and
the decoding time stamp DTS defines the time at which
the video stream should be decoded. Note that the presentation
time stamp PTS effectively defines the display
start timing of the access unit, and the decoding time
stamp DTS effectively defines the decoding start timing
of the access unit. If the PTS and DTS are the same
time, the DTS is omitted.
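
The rule that the DTS is omitted when it equals the PTS implies that a reader of the packet header treats a missing DTS as equal to the PTS. A minimal sketch of that interpretation follows; the class name `AccessUnitStamps` and its property are hypothetical, and the time stamps are simply treated as integer clock ticks.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AccessUnitStamps:
    pts: int                   # presentation time stamp, in clock ticks
    dts: Optional[int] = None  # decoding time stamp; None when omitted

    @property
    def decode_time(self) -> int:
        """Decoding start time: the DTS if present, otherwise the PTS."""
        return self.pts if self.dts is None else self.dts
```

An access unit carrying only a PTS of 180000 is decoded and presented at the same time, whereas one carrying a PTS of 180000 and a DTS of 171000 is decoded before it is presented.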

[0169] The packet header PTH also contains an 8-bit
field called the stream ID identifying the packet type,
i.e., whether the packet is a video packet containing a
video data stream, a private packet, or an MPEG audio
packet.

[0170] Private packets under the MPEG2 standard
are data packets of which the content can be freely defined.
Private packet 1 in this embodiment of the invention
is used to carry audio data other than the MPEG
audio data, and sub-picture data; private packet 2 carries
the PCI packet and DSI packet.
[0171] Private packets 1 and 2 each comprise a packet
header, private data area, and data area. The private
data area contains an 8-bit sub-stream ID indicating
whether the recorded data is audio data or sub-picture
data. The audio data defined by private packet 1 may
be defined as any of eight types #0 - #7 of linear PCM
or AC-3 encoded data. Sub-picture data may be defined
as one of up to 32 types #0 - #31.
[0172] The data area is the field to which data com- 
pressed according to the MPEG2 specification is written 
if the stored data is video data; linear PCM, AC-3, or 
MPEG encoded data is written if audio data is stored; 
or graphics data compressed by runlength coding is 
written if sub-picture data is stored. 
[0173] MPEG2-compressed video data may be com- 
pressed by constant bit rate (CBR) or variable bit rate 
(VBR) coding. With constant bit rate coding, the video 
stream is input continuously to the video buffer at a con- 
stant rate. This contrasts with variable bit rate coding in 
which the video stream is input intermittently to the video 
buffer, thereby making it possible to suppress the gen- 
eration of unnecessary code. Both constant bit rate and 
variable bit rate coding can be used in the digital video 
disk system. 

[0174] Because MPEG video data is compressed with
variable length coding, the data quantity in each 
group_of_pictures GOP is not constant. The video and 
audio decoding times also differ, and the time-base re- 
lationship between the video and audio data read from 
an optical disk, and the time-base relationship between 
the video and audio data output from the decoder, do 
not match. The method of time-base synchronizing the
video and audio data is therefore described in detail below
with reference to Fig. 26, and is first outlined briefly
here based on constant bit rate coding.
[0175] The navigation pack NV structure is shown in 
Fig. 20. Each navigation pack NV starts with a pack 
header PKH, and contains a PCI packet and DSI packet. 
[0176] As described above, the pack header PKH 
records the time at which that pack is to be sent from 
stream buffer 2400 to system decoder 2500 (see Fig. 
26), i.e., the system clock reference SCR defining the
reference time for synchronized audio-visual data play- 
back. 

[0177] Each PCI packet contains PCI General information
(PCI_GI) and Angle Information for Non-seamless
playback (NMSL_AGLI).

[0178] The PCI General information (PCI_GI) de- 
clares the display time of the first video frame (the Start 
PTM of VOBU (VOBU_S_PTM)), and the display time 
of the last video frame (End PTM of VOBU 
(VOBU_E_PTM)), in the corresponding video object 
unit VOBU with system clock precision (90 kHz).
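
Because both display times are stated at 90 kHz system clock precision, the presentation span of a video object unit VOBU can be estimated from the two values as follows. This sketch assumes that both values are plain tick counts with no wrap-around and that their difference is a usable estimate of the span; the function name is hypothetical.

```python
SYSTEM_CLOCK_HZ = 90_000  # system clock precision given in the text


def vobu_span_seconds(vobu_s_ptm: int, vobu_e_ptm: int) -> float:
    """Estimate the VOBU presentation span from VOBU_S_PTM and VOBU_E_PTM."""
    if vobu_e_ptm < vobu_s_ptm:
        raise ValueError("end time precedes start time")
    return (vobu_e_ptm - vobu_s_ptm) / SYSTEM_CLOCK_HZ
```

A difference of 45000 ticks, for instance, corresponds to 0.5 sec., which lies inside the 0.4 sec. to 1.0 sec. navigation pack interval mentioned earlier.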
[0179] The Angle Information for Non-seamless playback
(NMSL_AGLI) states the read start address of the
corresponding video object unit VOBU when the angle
is changed, expressed as the number of sectors from the
beginning of the video object VOB. Because there are
nine or fewer angles in this example, there are nine angle
address declaration cells: Destination Address of
Angle Cell #1 for Non-seamless playback
(NMSL_AGL_C1_DSTA) to Destination Address of Angle
Cell #9 for Non-seamless playback
(NMSL_AGL_C9_DSTA).
[0180] Each DSI packet contains DSI General Information
(DSI_GI), Seamless Playback Information
(SML_PBI), and Angle Information for Seamless playback
(SML_AGLI).

[0181] The DSI General Information (DSI_GI) declares
the address of the last pack in the video object
unit VOBU, i.e., the End Address for VOBU (VOBU_EA),
expressed as the number of sectors from the beginning
of the video object unit VOBU.

[0182] While seamless playback is described in detail
later, it should be noted that the continuously read data
units must be interleaved (multiplexed) at the system
stream level as an interleaved unit ILVU in order to
seamlessly reproduce split or combined titles. Plural
system streams interleaved with the interleaved unit ILVU
as the smallest unit are defined as an interleaved
block.

[0183] The Seamless Playback Information
(SML_PBI) is declared to seamlessly reproduce the
stream interleaved with the interleaved unit ILVU as the
smallest data unit, and contains an Interleaved Unit Flag
(ILVU flag) identifying whether the corresponding video
object unit VOBU is in an interleaved block. The ILVU flag
is set to 1 when the video object unit VOBU is in an
interleaved block; otherwise the ILVU flag is set to 0.

[0184] When a video object unit VOBU is in an interleaved
block, a Unit END flag is declared to indicate
whether the video object unit VOBU is the last VOBU in
the interleaved unit ILVU. Because the interleaved unit
ILVU is the data unit for continuous reading, the Unit
END flag is set to 1 if the VOBU currently being read is
the last VOBU in the interleaved unit ILVU. Otherwise
the Unit END flag is set to 0.

[0185] An Interleaved Unit End Address (ILVU_EA)
identifying the address of the last pack in the ILVU to
which the VOBU belongs, and the starting address of
the next interleaved unit ILVU, Next Interleaved Unit
Start Address (NT_ILVU_SA), are also declared when
a video object unit VOBU is in an interleaved block. Both
the Interleaved Unit End Address (ILVU_EA) and Next
Interleaved Unit Start Address (NT_ILVU_SA) are expressed
as the number of sectors from the navigation
pack NV of that VOBU.
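
Since ILVU_EA and NT_ILVU_SA are both sector counts measured from the navigation pack NV of the current VOBU, a reader that knows the sector of that navigation pack can convert them to absolute sectors by simple addition. The record `SeamlessPlaybackInfo`, the function `absolute_ilvu_addresses`, and the convention of returning `None` outside an interleaved block are assumptions made for this sketch.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class SeamlessPlaybackInfo:
    ilvu_flag: int      # 1 when the VOBU is in an interleaved block
    unit_end_flag: int  # 1 when the VOBU is the last VOBU of its ILVU
    ilvu_ea: int        # sectors from this NV pack to the last pack of the ILVU
    nt_ilvu_sa: int     # sectors from this NV pack to the start of the next ILVU


def absolute_ilvu_addresses(
    nv_sector: int, info: SeamlessPlaybackInfo
) -> Optional[Tuple[int, int]]:
    """Return (ILVU end sector, next ILVU start sector), or None if not interleaved."""
    if not info.ilvu_flag:
        return None  # the two addresses are declared only inside an interleaved block
    return nv_sector + info.ilvu_ea, nv_sector + info.nt_ilvu_sa
```

With the navigation pack at sector 1000, an ILVU_EA of 40 and an NT_ILVU_SA of 200 resolve to sectors 1040 and 1200 respectively.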

[0186] When two system streams are seamlessly
connected but the audio components of the two system
streams are not contiguous, particularly immediately before
and after the seam, it is necessary to pause the audio
output to synchronize the audio and video components
of the system stream following the seam. Note
that non-contiguous audio may result from different audio
signals being recorded with the corresponding video
blocks. With an NTSC signal, for example, the video
frame cycle is approximately 33.33 msec while the AC-3
audio frame cycle is 32 msec.
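
The mismatch between the two frame cycles can be made concrete with a short calculation in 90 kHz clock ticks. The sketch below uses the rounded figures from the text (approximately 33.33 msec, taken here as 3000 ticks, and 32 msec, i.e. 2880 ticks); the constant and function names are hypothetical, and the exact NTSC frame period differs slightly from the rounded value.

```python
VIDEO_FRAME_TICKS = 3000  # ~33.33 msec at 90 kHz (rounded figure from the text)
AUDIO_FRAME_TICKS = 2880  # 32 msec at 90 kHz


def audio_frames_in_video_span(video_frames: int) -> tuple:
    """Return (whole audio frames, leftover ticks) for a span of video frames."""
    span = video_frames * VIDEO_FRAME_TICKS
    return divmod(span, AUDIO_FRAME_TICKS)
```

Under these rounded figures, a span of 15 video frames holds 15 whole audio frames with 1800 ticks (20 msec) left over, and the frame boundaries coincide only every 24 video frames (25 audio frames). A stream cut at an arbitrary video frame boundary therefore generally ends partway through an audio frame.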
[0187] To enable this resynchronization, audio reproduction
stopping times 1 and 2, i.e., Audio Stop PTM 1
in VOB (VOB_A_STP_PTM1), and Audio Stop PTM 2 in
VOB (VOB_A_STP_PTM2), indicating the time at which
the audio is to be paused; and audio reproduction stopping
periods 1 and 2, i.e., Audio Gap Length 1 in VOB
(VOB_A_GAP_LEN1) and Audio Gap Length 2 in VOB
(VOB_A_GAP_LEN2), indicating for how long the audio
is to be paused, are also declared in the DSI packet.
Note that these times are specified at the system clock
precision (90 kHz).
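
The two stop-time and gap-length pairs can be read as pause windows on the audio output. The following sketch turns each pair into a (pause start, resume) window in 90 kHz ticks. Treating a gap length of zero as "no pause declared" is an assumption of this sketch, not something stated in the text, and the function name is hypothetical.

```python
from typing import List, Tuple


def audio_pause_windows(
    stops: List[Tuple[int, int]]
) -> List[Tuple[int, int]]:
    """Convert (VOB_A_STP_PTM, VOB_A_GAP_LEN) pairs into (pause, resume) windows."""
    windows = []
    for stop_ptm, gap_len in stops:
        if gap_len == 0:
            continue  # assumed here to mean that no pause is declared
        windows.append((stop_ptm, stop_ptm + gap_len))
    return windows
```

For example, a stop time of 90000 ticks with a gap length of 900 ticks pauses the audio at 1 sec. and resumes it 10 msec later.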

[0188] The Angle Information for Seamless playback 
(SML_AGLI) declares the read start address when the 
angle is changed. Note that this field is valid when seam- 
less, multi-angle control is enabled. This address is also 
expressed as the number of sectors from the navigation 
pack NV of that VOBU. Because there are nine or fewer
angles, there are nine angle address declaration cells: 
Destination Address of Angle Cell #1 for Seamless play- 
back (SML_AGL_C1_DSTA) to Destination Address of 
Angle Cell #9 for Seamless playback 
(SML_AGL_C9_DSTA). 

[0189] Note also that each title is edited in video ob- 
ject (VOB) units. Interleaved video objects (interleaved 
title editing units) are referenced as "VOBS"; and the 
encoded range of the source data is the encoding unit. 

DVD encoder 

[0190] A preferred embodiment of a digital video disk
system authoring encoder ECD in which the multimedia
bitstream authoring system according to the present invention
is applied to a digital video disk system is described
below and shown in Fig. 25. It will be obvious
that the authoring encoder ECD applied to the digital video
disk system, referred to below as a DVD encoder, is
substantially identical to the authoring encoder EC
shown in Fig. 2. The basic difference between these encoders
is the replacement in the DVD encoder ECD of
the video zone formatter 1300 of the authoring encoder
EC above with a VOB buffer 1000 and formatter 1100.
It will also be obvious that the bitstream encoded by this
DVD encoder ECD is recorded to a digital video disk
medium M. The operation of this DVD encoder ECD is
therefore described below in comparison with the authoring
encoder EC described above.
[0191] As in the above authoring encoder EC, the en- 
coding system controller 200 generates control signals 
St9, St11, St13, St21, St23, St25, St33, and St39 based 
on the scenario data St7 describing the user-defined
editing instructions input from the scenario editor 100, and
controls the video encoder 300, sub-picture encoder
500, and audio encoder 700 in the DVD encoder ECD.
Note that the user-defined editing instructions in the
DVD encoder ECD are a superset of the editing
instructions of the authoring encoder EC described above.
[0192] Specifically, the user-defined editing instruc- 
tions (scenario data St7) in the DVD encoder ECD sim- 
ilarly describe what source data is selected from all or 
a subset of the source data containing plural titles within 
a defined time period, and how the selected source data 
is reassembled to reproduce the scenario (sequence) 
intended by the user. The scenario data St7 of the DVD 
encoder ECD, however, further contains such informa- 
tion as: the number of streams contained in the editing 
units, which are obtained by splitting a multi-title source 
stream into blocks at a constant time interval; the 
number of audio and sub-picture data cells contained in 
each stream, and the sub-picture display time and peri- 
od; whether the title is a multi-rated title enabling paren- 
tal lock control; whether the user content is selected 
from plural streams including, for example, multiple 
viewing angles; and the method of connecting scenes 
when the angle is switched among the multiple viewing 
angles. 

[0193] The scenario data St7 of the DVD encoder
ECD also contains control information on a video object 
VOB unit basis. This information is required to encode 
the media source stream, and specifically includes such 
information as whether there are multiple angles or pa- 
rental control features. When multiple angle viewing is 
enabled, the scenario data St7 also contains the encod- 
ing bit rate of each stream considering data interleaving 
and the disk capacity, the start and end times of each 
control, and whether a seamless connection should be 
made between the preceding and following streams. 
[0194] The encoding system controller 200 extracts 
this information from the scenario data St7, and gener- 
ates the encoding information table and encoding pa- 
rameters required for encoding control. The encoding 
information table and encoding parameters are de- 
scribed with reference to Figs. 27, 28, and 29 below. 
[0195] The stream encoding data St33 contains the 
system stream encoding parameters and system en- 
coding start and end timing values required by the DVD 
system to generate the VOBs. These system stream en- 
coding parameters include the conditions for connecting 
one video object VOB with those before and after, the 
number of audio streams, the audio encoding
information and audio IDs, the number of sub-pictures and the
sub-picture IDs, the video playback starting time
information VPTS, and the audio playback starting time
information APTS.

[0196] The title sequence control signal St39 supplies 
the multimedia bitstream MBS formatting start and end 
timing information and formatting parameters declaring 
the reproduction control information and interleave in- 
formation. 

[0197] Based on the video encoding parameter and
encoding start/end timing signal St9, the video encoder
300 encodes a specific part of the video stream St1 to
generate an elementary stream conforming to the
MPEG2 Video standard defined in ISO-13818. This
elementary stream is output to the video stream buffer 400
as encoded video stream St15.
[0198] Note that while the video encoder 300 gener- 
ates an elementary stream conforming to the MPEG2 
Video standard defined in ISO-13818, specific encoding
parameters are input via the video encoding parameter 
signal St9, including the encoding start and end timing, 
bit rate, the encoding conditions for the encoding start 
and end, the material type, including whether the mate- 
rial is an NTSC or PAL video signal or telecine converted 
material, and whether the encoding mode is set for ei- 
ther open GOP or closed GOP encoding. 
[0199] The MPEG2 coding method is basically an in- 
terframe coding method using the correlation between 
frames for maximum signal compression, i.e., the frame 
being coded (the target frame) is coded by referencing 
frames before and/or after the target frame. However, 
intra-coded frames, i.e., frames that are coded based
solely on the content of the target frame, are also insert- 
ed to avoid error propagation and enable accessibility 
from mid-stream (random access). The coding unit con- 
taining at least one intra-coded frame ("intra-frame") is 
called a group_of_pictures GOP. 
[0200] A group_of_pictures GOP in which coding is
closed completely within that GOP is known as a "closed
GOP." A group_of_pictures GOP containing a frame
coded with reference to a frame in a preceding or
following group_of_pictures GOP is an "open GOP." (Note
that ISO-13818 does not limit P- and B-picture coding to
referencing past frames.) It is therefore
possible to play back a closed GOP using only that GOP.
Reproducing an open GOP, however, also requires the
presence of the referenced GOP, generally the GOP
preceding the open GOP.
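
The distinction between closed and open GOPs can be illustrated with a minimal Python sketch. The Frame type, its fields, and the sample stream are hypothetical; the check simply tests whether every frame in a GOP references only frames inside the same GOP, which is the definition given above.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass
class Frame:
    kind: str              # "I", "P", or "B"
    refs: Tuple[int, ...]  # stream-wide indices of the frames referenced


def is_closed_gop(frames: Sequence[Frame], start: int, end: int) -> bool:
    """True if every reference made inside frames[start:end] stays inside it."""
    return all(start <= ref < end
               for frame in frames[start:end]
               for ref in frame.refs)


# Hypothetical stream of two three-frame GOPs (indices 0-2 and 3-5).
stream = [
    Frame("I", ()), Frame("P", (0,)), Frame("B", (0, 1)),
    Frame("I", ()), Frame("B", (1, 3)), Frame("P", (3,)),
]
first_is_closed = is_closed_gop(stream, 0, 3)   # references stay in GOP 0
second_is_closed = is_closed_gop(stream, 3, 6)  # the B frame reaches into GOP 0
```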

[0201] The GOP is often used as the access unit. For 
example, the GOP may be used as the playback start 
point for reproducing a title from the middle, as a tran- 
sition point in a movie, or for fast-forward play and other 
special reproduction modes. High speed reproduction 
can be achieved in such cases by reproducing only the 
intra-frame coded frames in a GOP or by reproducing
only frames in GOP units. 

[0202] Based on the sub-picture stream encoding pa- 
rameter signal St11, the sub-picture encoder 500 en- 
codes a specific part of the sub-picture stream St3 to 
generate a variable length coded bitstream of bit- 
mapped data. This variable length coded bitstream data 
is output as the encoded sub-picture stream St17 to the 
sub-picture stream buffer 600. 
[0203] Based on the audio encoding parameter signal 
St13, the audio encoder 700 encodes a specific part of 
the audio stream St5 to generate the encoded audio
data. This encoded audio data may be data based on the
MPEG1 audio standard defined in ISO-11172 and the
MPEG2 audio standard defined in ISO-13818, AC-3
audio data, or PCM (LPCM) data. Note that the methods
and means of encoding audio data according to these 
standards are known and commonly available. 
[0204] The video stream buffer 400 is connected to
the video encoder 300 and to the encoding system con- 
troller 200. The video stream buffer 400 stores the en- 
coded video stream St15 input from the video encoder 
300, and outputs the stored encoded video stream St15 
as the time-delayed encoded video stream St27 based 
on the timing signal St21 supplied from the encoding 
system controller 200. 

[0205] The sub-picture stream buffer 600 is similarly 
connected to the sub-picture encoder 500 and to the en- 
coding system controller 200. The sub-picture stream 
buffer 600 stores the encoded sub-picture stream St17 
input from the sub-picture encoder 500, and then out- 
puts the stored encoded sub-picture stream St17 as 
time-delayed encoded sub-picture stream St29 based 
on the timing signal St23 supplied from the encoding 
system controller 200. 

[0206] The audio stream buffer 800 is similarly con- 
nected to the audio encoder 700 and to the encoding 
system controller 200. The audio stream buffer 800 
stores the encoded audio stream St19 input from the 
audio encoder 700, and then outputs the encoded audio 
stream St19 as the time-delayed encoded audio stream 
St31 based on the timing signal St25 supplied from the 
encoding system controller 200. 
[0207] The system encoder 900 is connected to the 
video stream buffer 400, sub-picture stream buffer 600, 
audio stream buffer 800, and the encoding system con- 
troller 200, and is respectively supplied thereby with the 
time-delayed encoded video stream St27, time-delayed 
encoded sub-picture stream St29, time-delayed
encoded audio stream St31, and the system stream encoding
parameter data St33. Note that the system encoder 900 
is a multiplexer that multiplexes the time-delayed 
streams St27, St29, and St31 based on the stream en- 
coding data St33 (timing signal) to generate title editing 
units (VOBs) St35. 
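
The multiplexing role of the system encoder 900 can be pictured with the following Python sketch, which merges three time-ordered pack sequences into one. This is only an assumption-laden simplification: the pack representation, the timestamps, and the use of a plain timestamp merge are all hypothetical, and an actual system encoder would also have to respect the buffer model and the parameters carried in St33.

```python
import heapq
from typing import Iterable, List, Tuple

Pack = Tuple[int, str]  # (timestamp in clock ticks, stream label)


def multiplex(*streams: Iterable[Pack]) -> List[Pack]:
    """Merge time-ordered pack streams into one time-ordered sequence."""
    return list(heapq.merge(*streams, key=lambda pack: pack[0]))


# Hypothetical stand-ins for the delayed streams St27, St29, and St31.
video = [(0, "video"), (3000, "video"), (6000, "video")]
sub_picture = [(1000, "sub-picture")]
audio = [(500, "audio"), (2660, "audio"), (4820, "audio")]
vob = multiplex(video, sub_picture, audio)
```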

[0208] The VOB buffer 1000 temporarily stores the 
video objects VOBs produced by the system encoder 
900. The formatter 1100 reads the delayed video objects
VOB from the VOB buffer 1000 based on the title se- 
quence control signal St39 to generate one video zone 
VZ, and adds the volume file structure VFS to generate 
the edited multimedia stream data St43. 
[0209] The multimedia bitstream MBS St43 edited
according to the user-defined scenario is then sent to the
recorder 1200. The recorder 1200 processes the edited 
multimedia stream data St43 to the data stream St45 
format of the recording medium M, and thus records the 
formatted data stream St45 to the recording medium M. 

DVD decoder 

[0210] A preferred embodiment of a digital video disk
system authoring decoder DCD in which the multimedia
bitstream authoring system of the present invention is 
applied to a digital video disk system is described below 
and shown in Fig. 26. The authoring decoder DCD ap- 
plied to the digital video disk system, referred to below 
as a DVD decoder DCD, decodes the multimedia bit- 
stream MBS edited using the DVD encoder ECD of the 
present invention, and recreates the content of each title 
according to the user-defined scenario. It will also be 
obvious that the multimedia bitstream St45 encoded by 
this DVD encoder ECD is recorded to a digital video disk 
medium M. 

[0211] The basic configuration of the DVD decoder 
DCD according to this embodiment is the same as that 
of the authoring decoder DC shown in Fig. 3. The differ- 
ences are that a different video decoder 3801 (shown 
as 3800 in Fig. 26) is used in place of the video decoder 
3800, and a reordering buffer 3300 and selector 3400 
are disposed between the video decoder 3801 and syn- 
thesizer 3500. 

[0212] Note that the selector 3400 is connected to the 
synchronizer 2900, and is controlled by a switching sig- 
nal St103. 

[0213] The operation of this DVD decoder DCD is 
therefore described below in comparison with the au- 
thoring decoder DC described above. 
[0214] As shown in Fig. 26, the DVD decoder DCD 
comprises a multimedia bitstream producer 2000, sce- 
nario selector 2100, decoding system controller 2300, 
stream buffer 2400, system decoder 2500, video buffer 
2600, sub-picture buffer 2700, audio buffer 2800, syn- 
chronizer 2900, video decoder 3801, reordering buffer 
3300, sub-picture decoder 3100, audio decoder 3200, 
selector 3400, synthesizer 3500, video data output ter- 
minal 3600, and audio data output terminal 3700. 
[0215] The bitstream producer 2000 comprises a re- 
cording media drive unit 2004 for driving the recording 
medium M; a reading head 2006 for reading the infor- 
mation recorded to the recording medium M and pro- 
ducing the binary read signal St57; a signal processor 
2008 for variously processing the read signal St57 to 
generate the reproduced bitstream St61; and a
reproduction controller 2002.

[0216] The reproduction controller 2002 is connected 
to the decoding system controller 2300 from which the 
multimedia bitstream reproduction control signal St53 is 
supplied, and in turn generates the reproduction control 
signals St55 and St59 respectively controlling the re- 
cording media drive unit (motor) 2004 and signal proc- 
essor 2008. 

[0217] So that the user-defined video, sub-picture, 
and audio portions of the multimedia title edited by the 
authoring encoder EC are reproduced, the authoring
decoder DC comprises a scenario selector 2100 for
selecting and reproducing the corresponding scenes (titles).
The scenario selector 2100 then outputs the selected 
titles as scenario data to the DVD decoder DCD. 
[0218] The scenario selector 2100 preferably
comprises a keyboard, CPU, and monitor. Using the
keyboard, the user inputs the desired scenario based
on the content of the scenario input by the DVD encoder
ECD. Based on the keyboard input, the CPU generates
the scenario selection data St51 specifying the selected
scenario. The scenario selector 2100 is connected to
the decoding system controller 2300 by an infrared
communications device, for example, and inputs the
generated scenario selection data St51 to the decoding
system controller 2300.
[0219] The stream buffer 2400 has a specific buffer
capacity used to temporarily store the reproduced
bitstream St61 input from the bitstream producer 2000,
extract the volume file structure VFS, the initial
synchronization data SCR (system clock reference) in each pack,
and the VOBU control information (DSI) in the
navigation pack NV, to generate the bitstream control data
St63. The stream buffer 2400 is also connected to the
decoding system controller 2300, to which it supplies the
generated bitstream control data St63.
[0220] Based on the scenario selection data St51
supplied by the scenario selector 2100, the decoding
system controller 2300 then generates the bitstream
reproduction control signal St53 controlling the operation
of the bitstream producer 2000. The decoding system
controller 2300 also extracts the user-defined playback
instruction data from the bitstream reproduction control
signal St53, and generates the decoding information
table required for decoding control. This decoding
information table is described further below with reference
to Figs. 58 and 59. The decoding system controller 2300
also extracts the title information recorded to the optical
disk M from the file data structure area FDS of the
bitstream control data St63 to generate the title information
signal St200. Note that the extracted title information
includes the video manager VMG, VTS information VTSI,
the PGC information entries C_PBI #j, and the cell
presentation time C_PBTM.

[0221] Note that the bitstream control data St63 is 
generated in pack units as shown in Fig. 19, and is
supplied from the stream buffer 2400 to the decoding
system controller 2300, to which the stream buffer 2400 is
connected. 

[0222] The synchronizer 2900 is connected to the
decoding system controller 2300 from which it receives the
system clock reference SCR contained in the synchro- 
nization control data St81 to set the internal system 
clock STC and supply the reset system clock St79 to the 
decoding system controller 2300. 

[0223] Based on this system clock St79, the decoding
system controller 2300 also generates the stream read 
signal St65 at a specific interval and outputs the read 
signal St65 to the stream buffer 2400. Note that the read 
unit in this case is the pack. 

[0224] The method of generating the stream read
signal St65 is described next.

[0225] The decoding system controller 2300 com- 
pares the system clock reference SCR contained in the 
stream control data extracted from the stream buffer
2400 with the system clock St79 supplied from the syn- 
chronizer 2900, and generates the read request signal 
St65 when the system clock St79 is greater than the sys- 
tem clock reference SCR of the bitstream control data 
St63. Pack transfers are controlled by executing this 
control process on a pack unit. 
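
The comparison described above can be sketched in Python as follows. The function name and the queue of pending SCR values are hypothetical; the only rule taken from the description is that a read request is issued for a pack when the system clock is greater than that pack's system clock reference SCR.

```python
from collections import deque
from typing import Iterable, List, Tuple


def release_packs(pack_scrs: Iterable[int], stc: int) -> Tuple[List[int], List[int]]:
    """Split queued packs into those read out at clock `stc` and those kept.

    A pack is requested only while the system clock is strictly greater
    than the SCR of the pack at the head of the queue.
    """
    pending = deque(pack_scrs)
    released = []
    while pending and stc > pending[0]:
        released.append(pending.popleft())
    return released, list(pending)


# Hypothetical SCR values (in clock ticks) of four queued packs.
released, waiting = release_packs([0, 10, 20, 30], stc=20)
```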
[0226] Based on the scenario selection data St51, the
decoding system controller 2300 generates the
decoding signal St69 defining the stream IDs for the video,
sub-picture, and audio bitstreams corresponding to the
selected scenario, and outputs it to the system decoder
2500.

[0227] When a title contains plural audio tracks, e.g. 
audio tracks in Japanese, English, French, and/or other 
languages, and plural sub-picture tracks for subtitles in 
Japanese, English, French, and/or other languages, for 
example, a discrete ID is assigned to each of the lan- 
guage tracks. As described above with reference to Fig. 
19, a stream ID is assigned to the video data and MPEG
audio data, and a substream ID is assigned to the sub- 
picture data, AC-3 audio data, linear PCM data, and 
navigation pack NV information. While the user need 
never be aware of these ID numbers, the user can select 
the language of the audio and/or subtitles using the
scenario selector 2100. If English language audio is
selected, for example, the ID corresponding to the English
audio track is sent to the decoding system controller 2300
as scenario selection data St51. The decoding system 
controller 2300 then adds this ID to the decoding signal 
St69 output to the system decoder 2500. 
[0228] Based on the instructions contained in the de- 
coding signal St69, the system decoder 2500 respec- 
tively outputs the video, sub-picture, and audio bit- 
streams input from the stream buffer 2400 to the video 
buffer 2600, sub-picture buffer 2700, and audio buffer 
2800 as the encoded video stream St71, encoded sub- 
picture stream St73, and encoded audio stream St75. 
Thus, when the stream ID input from the scenario
selector 2100 and the pack ID input from the stream buffer
2400 match, the system decoder 2500 outputs the
corresponding packs to the respective buffers (i.e., the
video buffer 2600, sub-picture buffer 2700, and audio buffer
2800).
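
The ID matching performed by the system decoder 2500 can be sketched in Python as a simple routing step. The pack tuples, the ID strings, and the dictionary of buffers are hypothetical stand-ins; actual stream and substream IDs are numeric values defined by the data structure described with reference to Fig. 19.

```python
from typing import Dict, Iterable, List, Tuple

Pack = Tuple[str, str, bytes]  # (kind, pack ID, payload)


def route_packs(packs: Iterable[Pack],
                selected: Dict[str, str]) -> Dict[str, List[bytes]]:
    """Deliver only packs whose ID matches the selected ID for their kind."""
    buffers: Dict[str, List[bytes]] = {"video": [], "sub_picture": [], "audio": []}
    for kind, pack_id, payload in packs:
        if selected.get(kind) == pack_id:
            buffers[kind].append(payload)
    return buffers


# Hypothetical selection: English audio with Japanese subtitles.
selection = {"video": "v0", "audio": "audio-en", "sub_picture": "sub-ja"}
incoming = [
    ("video", "v0", b"V1"),
    ("audio", "audio-ja", b"A-ja"),
    ("audio", "audio-en", b"A-en"),
    ("sub_picture", "sub-ja", b"S-ja"),
    ("sub_picture", "sub-en", b"S-en"),
]
routed = route_packs(incoming, selection)
```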

[0229] The system decoder 2500 detects the presen- 
tation time stamp PTS and decoding time stamp DTS of 
the smallest control unit in each bitstream St67 to gen- 
erate the time information signal St77. This time infor- 
mation signal St77 is supplied to the synchronizer 2900 
through the decoding system controller 2300 as the syn- 
chronization control data St81. 
[0230] Based on this synchronization control data 
St81, the synchronizer 2900 determines the decoding 
start timing whereby each of the bitstreams will be ar- 
ranged in the correct sequence after decoding, and then 
generates and inputs the video stream decoding start 
signal St89 to the video decoder 3801 based on this de- 
coding timing. The synchronizer 2900 also generates 
and supplies the sub-picture decoding start signal St91
and audio stream decoding start signal St93 to the sub- 
picture decoder 3100 and audio decoder 3200, respec- 
tively. 

[0231] The video decoder 3801 generates the video
output request signal St84 based on the video stream
decoding start signal St89, and outputs it to the video
buffer 2600. In response to the video output request signal
St84, the video buffer 2600 outputs the video stream
St83 to the video decoder 3801. The video decoder
3801 thus detects the presentation time information
contained in the video stream St83, and disables the
video output request signal St84 when the length of the
received video stream St83 is equivalent to the specified
presentation time. A video stream equal in length to the
specified presentation time is thus decoded by the video
decoder 3801, which outputs the reproduced video
signal St95 to the reordering buffer 3300 and selector 3400.
[0232] Because the encoded video stream is coded
using the interframe correlations between pictures, the
coded order and display order do not necessarily match
on a frame unit basis. The video cannot, therefore, be
displayed in the decoded order. The decoded frames
are therefore temporarily stored to the reordering buffer
3300. The synchronizer 2900 therefore controls the
switching signal St103 so that the reproduced video
signal St95 output from the video decoder 3800 and the
reordering buffer output St97 are appropriately selected
and output in the display order to the synthesizer 3500.
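
One conventional way to restore display order is sketched below in Python. It is an assumption, not the method specified here: B-frames are passed straight through (as if the selector chose the decoder output), while each I- or P-frame is held in a one-frame reordering buffer until the next I- or P-frame arrives. The frame labels are hypothetical, with the digit giving the display position.

```python
from typing import Iterable, List, Optional


def display_order(decoded: Iterable[str]) -> List[str]:
    """Reorder frames from decoding order to display order."""
    held: Optional[str] = None  # models the content of the reordering buffer
    shown: List[str] = []
    for frame in decoded:
        if frame.startswith("B"):
            shown.append(frame)      # selector passes the decoder output
        else:
            if held is not None:
                shown.append(held)   # selector passes the reordering buffer
            held = frame
    if held is not None:
        shown.append(held)           # flush the last held reference frame
    return shown


decoding_sequence = ["I0", "P3", "B1", "B2", "P6", "B4", "B5"]
presented = display_order(decoding_sequence)
```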
[0233] The sub-picture decoder 3100 similarly
generates the sub-picture output request signal St86 based
on the sub-picture decoding start signal St91, and
outputs it to the sub-picture buffer 2700. In response to the
sub-picture output request signal St86, the sub-picture
buffer 2700 outputs the sub-picture stream St85 to the
sub-picture decoder 3100. Based on the presentation
time information contained in the sub-picture stream
St85, the sub-picture decoder 3100 decodes a length of
the sub-picture stream St85 corresponding to the
specified presentation time to reproduce and supply to the
synthesizer 3500 the sub-picture signal St99.
[0234] The synthesizer 3500 superimposes the
selector 3400 output with the sub-picture signal St99 to
generate and output the video signal St105 to the video data
output terminal 3600.

[0235] The audio decoder 3200 generates and
supplies to the audio buffer 2800 the audio output request
signal St88 based on the audio stream decoding start
signal St93. The audio buffer 2800 thus outputs the
audio stream St87 to the audio decoder 3200. The audio
decoder 3200 decodes a length of the audio stream
St87 corresponding to the specified presentation time
based on the presentation time information contained in
the audio stream St87, and outputs the decoded audio
stream St101 to the audio data output terminal 3700.
[0236] It is thus possible to reproduce a user-defined 
multimedia bitstream MBS in real-time according to a 
user-defined scenario. More specifically, each time the
user selects a different scenario, the DVD decoder DCD
is able to reproduce the title content desired by the user 
in the desired sequence by reproducing the multimedia 
bitstream MBS corresponding to the selected scenario. 
[0237] It should be noted that the decoding system
controller 2300 may supply the title information signal
St200 to the scenario selector 2100 by means of the
infrared communications device mentioned above or
another means. Interactive scenario selection controlled
by the user can also be made possible by the scenario
selector 2100 extracting the title information recorded to
the optical disk M from the file data structure area FDS
of the bitstream control data St63 contained in the title
information signal St200, and displaying this title
information on a display for user selection.
[0238] Note, further, that the stream buffer 2400, vid- 
eo buffer 2600, sub-picture buffer 2700, audio buffer 
2800, and reordering buffer 3300 are expressed above 
and in the figures as separate entities because they are 
functionally different. It will be obvious, however, that a 
single buffer memory can be controlled to provide the 
same discrete functionality by time-share controlled use 
of a buffer memory with an operating speed plural times 
faster than the read and write rates of these separate 
buffers. 

Multi-scene control 

[0239] The concept of multiple angle scene control
according to the present invention is described below
with reference to Fig. 21. As described above, titles that
can be played back with numerous variations are
constructed from basic scene periods containing data
common to each title, and multi-scene periods comprising
groups of different scenes corresponding to the various
scenario requests. In Fig. 21, scenes 1, 5, and 8 are the
common scenes of the basic scene periods. The
multi-angle scenes (angles 1, 2, and 3) between scenes 1 and
5, and the parental locked scenes (scenes 6 and 7)
between scenes 5 and 8, are the multi-scene periods.
[0240] Scenes taken from different angles, i.e.,
angles 1, 2, and 3 in this example, can be dynamically
selected and reproduced during playback in the
multi-angle scene period. In the parental locked scene period,
however, only one of the available scenes, scenes 6 and 
7, having different content can be selected, and must be 
selected statically before playback begins. 
[0241] Which of these scenes from the multi-scene 
periods is to be selected and reproduced is defined by 
the user operating the scenario selector 2100 and
thereby generating the scenario selection data St51. In
scenario 1 in Fig. 21 the user can freely select any of the
multi-angle scenes, and scene 6 has been preselected 
for output in the parental locked scene period. Similarly 
in scenario 2, the user can freely select any of the multi- 
angle scenes, and scene 7 has been preselected for 
output in the parental locked scene period. 
[0242] The contents of the program chain information
VTS_PGCI are described next with reference to Figs. 30
and 31. Fig. 30 shows how a scenario requested by the
user is expressed in the VTSI data structure. Scenario 1
and scenario 2 shown in Fig. 21 are described as program
chain information VTS_PGCI#1 and VTS_PGCI#2.
VTS_PGCI#1, describing scenario 1, consists of cell
playback information C_PBI#1 corresponding to scene 1;
C_PBI#2, C_PBI#3, and C_PBI#4 within a multi-angle cell
block; C_PBI#5 corresponding to scene 5; C_PBI#6
corresponding to scene 6; and C_PBI#7 corresponding to
scene 8.

[0243] VTS_PGCI#2 describing the scenario 2
consists of cell playback information C_PBI#1
corresponding to scene 1, C_PBI#2, C_PBI#3, and C_PBI#4 within
a multi-angle cell block corresponding to a multi-angle 
scene, C_PBI#5 corresponding to scene 5, C_PBI#6 
corresponding to scene 7, and C_PBI#7 corresponding 
to scene 8. According to the digital video system data 
structure, a scene which is a control unit of a scenario 
is described as a cell which is a unit thereunder, thus a 
scenario requested by a user can be obtained. 
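
The two program chains described above can be written out as a small Python data structure. The tuple layout and function names are hypothetical, and the sketch fixes one angle for the whole multi-angle period for simplicity, whereas the description allows the angle to be changed dynamically during playback.

```python
from typing import List, Tuple

Cell = Tuple[str, str, bool]  # (cell ID, content, inside multi-angle block?)


def build_pgc(parental_scene: int) -> List[Cell]:
    """Cells of one program chain; only C_PBI#6 differs between scenarios."""
    return [
        ("C_PBI#1", "scene 1", False),
        ("C_PBI#2", "angle 1", True),
        ("C_PBI#3", "angle 2", True),
        ("C_PBI#4", "angle 3", True),
        ("C_PBI#5", "scene 5", False),
        ("C_PBI#6", f"scene {parental_scene}", False),
        ("C_PBI#7", "scene 8", False),
    ]


def playback_order(pgc: List[Cell], angle: int) -> List[str]:
    """Contents presented when a single angle is chosen for the angle block."""
    wanted = f"angle {angle}"
    return [content for _, content, in_block in pgc
            if not in_block or content == wanted]


scenario_1 = build_pgc(parental_scene=6)  # corresponds to scenario 1
scenario_2 = build_pgc(parental_scene=7)  # corresponds to scenario 2
```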
[0244] Fig. 31 shows how the scenarios requested by the
user in Fig. 21 are expressed in the VOB data structure
VTSTT_VOBS. As specifically shown in Fig. 31, the two
scenarios 1 and 2 use the same VOB data in common.
The scenes shared by both scenarios, i.e., VOB#1
corresponding to scene 1, VOB#5 corresponding to scene
5, and VOB#8 corresponding to scene 8, are each arranged
in a non-interleaved block, i.e., a contiguous block.
[0245] In the multi-angle data shared by scenarios 1
and 2, each angle scene is constructed from a single VOB.
Specifically, angle 1 is constructed from VOB#2, angle 2
from VOB#3, and angle 3 from VOB#4. The multi-angle data
thus constructed is formed as an interleaved block to
enable switching between angles and seamless
reproduction of each angle's data. Scenes 6 and 7, which are
specific to scenarios 1 and 2, respectively, are likewise
formed as an interleaved block to enable seamless
reproduction with the common scenes before and after them
as well as seamless reproduction of each scene.

[0246] As described above, the scenarios requested by
the user as shown in Fig. 21 can be realized by utilizing
the video title playback control information shown in Fig.
30 and the title playback VOB data structure shown in
Fig. 31.

Seamless playback 

[0247] The seamless playback capability briefly men- 
tioned above with regard to the digital video disk system 
data structure is described below. Note that seamless 
playback refers to the reproduction in a digital video disk 
system of multimedia data including video, audio, and 
sub-picture data without intermittent breaks in the data 
or information between basic scene periods, between
basic scene periods and multi-scene periods, and be- 
tween multi-scene periods. 

[0248] Hardware factors contributing to intermittent 
playback of this data and title content include decoder
underflow, i.e., an imbalance between the source data 
input speed and the decoding speed of the input source 
data. 

[0249] Other factors relate to the properties of the
playback data. When the playback data is data that must
be continuously reproduced for a constant time unit in
order for the user to understand the content or
information, e.g., audio data, data continuity is lost when the
required continuous presentation time cannot be
assured. Reproduction of such information whereby the
required continuity is assured is referred to as
"contiguous information reproduction," or "seamless
information reproduction." Reproduction of this information
when the required continuity cannot be assured is
referred to as "non-continuous information reproduction,"
or "non-seamless information reproduction." It is
obvious that continuous information reproduction and
non-continuous information reproduction are, respectively,
seamless and non-seamless reproduction.

[0250] Note that seamless reproduction can be
further categorized as seamless data reproduction and
seamless information reproduction. Seamless data
reproduction is defined as preventing physical blanks or
interruptions in the data playback (intermittent
reproduction) as a result of a buffer underflow state, for example.
Seamless information reproduction is defined as
preventing apparent interruptions in the information when
perceived by the user (intermittent presentation) when
recognizing information from the playback data where
there are no actual physical breaks in the data
reproduction.

Details of Seamless playback 

[0251] The specific method enabling seamless
reproduction as thus described is described below with
reference to Figs. 23 and 24.

Interleaving 

[0252] The DVD data system streams described 
above are recorded using an appropriate authoring en- 
coder EC as a movie or other multimedia title on a DVD 
recording medium. Note that the following description 
refers to a movie as the multimedia title being
processed, but it will be obvious that the invention shall not
be so limited. 

[0253] Supplying a single movie in a format enabling
the movie to be used in plural different cultural regions
or countries requires the script to be recorded in the
various languages used in those regions or countries. It
may even necessitate editing the content to conform to
the mores and moral expectations of different cultures.
Even using such a large-capacity storage system as the
DVD system, however, it is necessary to reduce the bit
rate, and therefore the image quality, if plural full-length
titles edited from a single common source title are
recorded to a single disk. This problem can be solved by
recording the common parts of plural titles only once,
and recording the segments different in each title for
each different title only. This method makes it possible
to record plural titles for different countries or cultures
to a single optical disk without reducing the bit rate, and,
therefore, while retaining high image quality.
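
The saving obtained by recording the common parts only once can be shown with a short worked calculation in Python. The durations are hypothetical and the function names are illustrative only; the point is simply that the total recorded running time, and hence the pressure to lower the bit rate, grows far more slowly when the common scenes are shared.

```python
from typing import Sequence


def minutes_recorded_separately(common_min: int, specific_min: Sequence[int]) -> int:
    """Total running time if every title is recorded in full."""
    return sum(common_min + own for own in specific_min)


def minutes_recorded_shared(common_min: int, specific_min: Sequence[int]) -> int:
    """Total running time if the common scenes are recorded only once."""
    return common_min + sum(specific_min)


# Hypothetical title: 100 min of common scenes, two versions with
# 10 min and 12 min of version-specific scenes.
separate_total = minutes_recorded_separately(100, [10, 12])
shared_total = minutes_recorded_shared(100, [10, 12])
```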
[0254] As shown in Fig. 21, the titles recorded to a 
single optical disk contain basic scene periods of scenes 
common to all scenarios, and multi-scene periods
containing scenes specific to certain scenarios, to provide
parental lock control and multi-angle scene control func- 
tions. 

[0255] In the case of the parental lock control function,
titles containing sex scenes, violent scenes, or other
scenes deemed unsuitable for children, i.e., so-called
"adult scenes," are recorded with a combination of
common scenes, adult scenes, and children's scenes.
These title streams are achieved by arraying the adult
and children's scenes to multi-scene periods between
the common basic scene periods.
[0256] Multi-angle control can be achieved in a con- 
ventional single-angle title by recording plural multime- 
dia scenes obtained by recording the subjects from the 
desired plural camera angles to the multi-scene periods 
arrayed between the common basic scene periods. 30 
Note, however, that while these plural scenes are de- 
scribed here as scenes recorded from different camera 
angles (positions), it will be obvious that the scenes may 
be recorded from the same camera angle but at different 
times, data generated by computer graphics, or other 35 
video data. 

[0257] When data is shared between different scenarios of a
single title, it is obviously necessary to move the laser
beam LS from the common scene data to the non-common scene
data during reproduction, i.e., to move the optical pickup
to a different position on the DVD recording medium RC1. The
problem here is that the time required to move the optical
pickup makes it difficult to continue reproduction without
creating breaks in the audio or video, i.e., to sustain
seamless reproduction. This problem can be theoretically
solved by providing a track buffer (stream buffer 2400) to
delay data output by an amount equivalent to the worst
access time. In general, data recorded to an optical disk is
read by the optical pickup, appropriately processed, and
temporarily stored to the track buffer. The stored data is
subsequently decoded and reproduced as video or audio data.
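
The buffering argument above can be illustrated with a rough calculation. The following is only a sketch under stated assumptions: the function name, the 8 Mbit/s consumption rate, and the 0.5 s worst-case access time are hypothetical values chosen for illustration and are not taken from this specification. It estimates how much data must already be held in the track buffer so that decoding can continue while the pickup is moving and no new data is being read.

```python
def min_buffered_bytes(consume_rate_bps: float, worst_access_s: float) -> float:
    """Data (in bytes) the decoder drains from the track buffer while the
    optical pickup is jumping and no new data arrives."""
    return consume_rate_bps * worst_access_s / 8.0


# Hypothetical figures, not taken from the specification:
# an 8 Mbit/s stream and a 0.5 s worst-case access time.
required = min_buffered_bytes(8_000_000, 0.5)
print(required)  # 500000.0
```

A buffer sized only by this estimate covers a single worst-case jump; it says nothing about how quickly the buffer is refilled afterwards.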

Definition of Interleaving 

[0258] To thus enable the user to selectively excise scenes
and choose from among plural scenes, a state wherein
non-selected scene data is recorded inserted between common
scene data and selective scene data necessarily occurs
because the data units associated with individual scenes are
contiguously recorded to the recording tracks of the
recording medium. If data is then read in the recorded
sequence, non-selected scene data must be accessed before
accessing and decoding the selected scene data, and a
seamless connection with the selected scene is difficult.
The excellent random access characteristics of the digital
video disk system, however, make seamless connections with
the selected scenes possible.

[0259] In other words, by splitting scene-specific data into
plural units of a specified data size, and interleaving
plural split data units for different scenes in a predefined
sequence that is recorded to disk within the jumping range
whereby a data underflow state does not occur, it is
possible to reproduce the selected scenes without data
interruptions by intermittently accessing and decoding the
data specific to the selected scenes using these split data
units. Seamless data reproduction is thereby assured.
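
The splitting and interleaving just described can be sketched in a few lines. This is a minimal illustration, not the encoder of the invention: the function names, the byte-string scene data, and the fixed unit size of two bytes are assumptions made for the example, and the jump-range and underflow conditions are not modelled.

```python
from itertools import zip_longest


def split_units(data: bytes, unit_size: int) -> list[bytes]:
    """Split one scene's data into units of at most unit_size bytes."""
    return [data[i:i + unit_size] for i in range(0, len(data), unit_size)]


def interleave(scenes: dict[str, bytes], unit_size: int) -> list[tuple[str, bytes]]:
    """Lay out the units of all scenes alternately, as they would sit on the track."""
    per_scene = {name: split_units(data, unit_size) for name, data in scenes.items()}
    track = []
    for group in zip_longest(*per_scene.values()):
        for name, unit in zip(per_scene, group):
            if unit is not None:  # shorter scenes simply run out of units
                track.append((name, unit))
    return track


def read_scene(track: list[tuple[str, bytes]], selected: str) -> bytes:
    """Read only the selected scene's units, jumping over the others."""
    return b"".join(unit for name, unit in track if name == selected)


scenes = {"B": b"bbbbbb", "C": b"cccc"}
track = interleave(scenes, unit_size=2)
print(" ".join(name for name, _ in track))  # B C B C B
print(read_scene(track, "C"))               # b'cccc'
```

Because each unit of a non-selected scene is short, the reader never has to skip more than one such unit at a time, which is what keeps each jump within a bounded distance.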

Interleaved block and Interleave unit 

[0260] The interleaving method enabling seamless 
data reproduction according to the present invention is 
described below with reference to Fig. 24 and Fig. 67. 
Shown in Fig. 24 is a case from which three scenarios 
may be derived, i.e., branching from one video object 
VOB-A to one of plural video objects VOB-B, VOB-C, 
and VOB-D, and then merging back again to a single 
video object VOB-E. The actual arrangement of these 
blocks recorded to a data recording track TR on disk is 
shown in Fig. 67. 

[0261] Referring to Fig. 67, VOB-A and VOB-E are video
objects with independent playback start and end times, and
are in principle arrayed to contiguous block regions. As
shown in Fig. 24, the playback start and end times of VOB-B,
VOB-C, and VOB-D are aligned during interleaving. The
interleaved data blocks are then recorded to disk to a
contiguous interleaved block region. The contiguous block
regions and interleaved block regions are then written to
disk in the track path Dr direction in the playback
sequence. Plural video objects VOB, i.e., interleaved video
objects VOBS, arrayed to the data recording track TR are
shown in Fig. 67.
[0262] Referring to Fig. 67, data regions to which data 
is continuously arrayed are called "blocks," of which 
there are two types: "contiguous block regions" in which 
VOB with discrete starting and end points are contigu- 
ously arrayed, and "interleaved block regions" in which 
plural VOB with aligned starting and end points are in- 
terleaved. The respective blocks are arrayed as shown 
in Fig. 68 in the playback sequence, i.e., block 1, block 
2, block 3, . . . block 7. 

[0263] As shown in Fig. 68, the VTS title VOBS (VTSTT_VOBS)
consists of blocks 1 - 7, inclusive. Block 1 contains VOB 1
alone. Blocks 2, 3, 5, and 7 similarly discretely contain
VOB 2, 3, 6, and 10. Blocks 2, 3, 5, and 7 are thus
contiguous block regions.
[0264] Block 4, however, contains VOB 4 and VOB 5 
interleaved together, while block 6 contains VOB 7, VOB 
8, and VOB 9 interleaved together. Blocks 4 and 6 are 
thus interleaved block regions. 
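
The block layout described for Fig. 68 can be written down as a small table. The sketch below is illustrative only: the `Block` class and its field names are assumptions introduced here, and block 1 is marked as not interleaved on the basis that it holds a single VOB.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Block:
    number: int             # position in the playback sequence
    vobs: tuple[int, ...]   # VOB numbers recorded in this block
    interleaved: bool       # True for an interleaved block region


# Layout of blocks 1 - 7 as described for Fig. 68.
VTSTT_VOBS = [
    Block(1, (1,), False),
    Block(2, (2,), False),
    Block(3, (3,), False),
    Block(4, (4, 5), True),
    Block(5, (6,), False),
    Block(6, (7, 8, 9), True),
    Block(7, (10,), False),
]

interleaved = [b.number for b in VTSTT_VOBS if b.interleaved]
all_vobs = [v for b in VTSTT_VOBS for v in b.vobs]
print(interleaved)  # [4, 6]
print(all_vobs)     # [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
```

The region type is stored explicitly rather than inferred from the VOB count, because a contiguous block region may also hold more than one VOB.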
[0265] The internal data structure of the contiguous block
regions is shown in Fig. 69 with VOB-i and VOB-j arrayed as
the contiguous blocks in the VOBS. As described with
reference to Fig. 16, VOB-i and VOB-j inside the contiguous
block regions are further logically divided into cells as
the playback unit. Both VOB-i and VOB-j in this figure are
shown comprising three cells CELL #1, CELL #2, and CELL #3.
[0266] Each cell comprises one or more video object units
VOBU, with the video object unit VOBU defining the
boundaries of the cell. Each cell also contains information
identifying the position of the cell in the program chain
PGC (the playback control information of the digital video
disk system). More specifically, this position information
is the address of the first and last VOBU in the cell. As
also shown in Fig. 69, these VOB and the cells defined
therein are also recorded to a contiguous block region so
that contiguous blocks are contiguously reproduced.
Reproducing these contiguous blocks therefore poses no
problem.

[0267] The internal data structure of the interleaved block
regions is shown in Fig. 70. In the interleaved block
regions each video object VOB is divided into interleaved
units ILVU, and the interleaved units ILVU associated with
each VOB are alternately arrayed. Cell boundaries are
defined independently of the interleaved units ILVU. For
example, VOB-k is divided into four interleaved units
ILVUk1, ILVUk2, ILVUk3, and ILVUk4, and is confined by a
single cell CELL#k. VOB-m is likewise divided into four
interleaved units ILVUm1, ILVUm2, ILVUm3, and ILVUm4, and is
confined by a single cell CELL#m. Note that instead of a
single cell CELL#k or CELL#m, each of VOB-k and VOB-m can be
divided into more than two cells. The interleaved units ILVU
thus contain both audio and video data.
[0268] In the example shown in Fig. 70, the interleaved
units ILVUk1, ILVUk2, ILVUk3, and ILVUk4, and ILVUm1,
ILVUm2, ILVUm3, and ILVUm4, from two different video objects
VOB-k and VOB-m are alternately arrayed within a single
interleaved block. By interleaving the interleaved units
ILVU of two video objects VOB in this sequence, it is
possible to achieve seamless reproduction branching from one
scene to one of plural scenes, and from one of plural scenes
to one scene.

Multi-scene control 

[0269] The multi-scene period is described together 
with the concept of multi-scene control according to the 
present invention using by way of example a title com- 
prising scenes recorded from different angles. 
[0270] Each scene in multi-scene control may be recorded
from the same angle but at different times, or may even be
computer graphics data. The multi-angle scene periods may
therefore also be called multi-scene periods.

Parental control 

[0271] The concept of recording plural titles comprising
alternative scenes for such functions as parental lock
control and recording director's cuts is described below
using Fig. 40.

[0272] An example of a multi-rated title stream providing
for parental lock control is shown in Fig. 40. When
so-called "adult scenes" containing sex, violence, or other
scenes deemed unsuitable for children are contained in a
title implementing parental lock control, the title stream
is recorded with a combination of common system streams SSa,
SSb, and SSe, an adult-oriented system stream SSc containing
the adult scenes, and a child-oriented system stream SSd
containing only the scenes suitable for children. Title
streams such as this are recorded as a multi-scene system
stream containing the adult-oriented system stream SSc and
the child-oriented system stream SSd arrayed to the
multi-scene period between common system streams SSb and
SSe.
[0273] The relationship between each of the compo- 
nent titles and the system stream recorded to the pro- 
gram chain PGC of a title stream thus comprised is de- 
scribed below. 

[0274] The adult-oriented title program chain PGC1 comprises
in sequence the common system streams SSa and SSb, the
adult-oriented system stream SSc, and the common system
stream SSe. The child-oriented title program chain PGC2
comprises in sequence the common system streams SSa and SSb,
the child-oriented system stream SSd, and the common system
stream SSe.

[0275] By thus arraying the adult-oriented system stream SSc
and child-oriented system stream SSd to a multi-scene
period, the decoding method previously described can
reproduce the title containing adult-oriented content by
reproducing the common system streams SSa and SSb, then
selecting and reproducing the adult-oriented system stream
SSc, and then reproducing the common system stream SSe as
instructed by the adult-oriented title program chain PGC1.
By alternatively following the child-oriented title program
chain PGC2 and selecting the child-oriented system stream
SSd in the multi-scene period, a child-oriented title from
which the adult-oriented scenes have been expurgated can be
reproduced.

[0276] This method of providing in the title stream a
multi-scene period containing plural alternative scenes,
selecting which of the scenes in the multi-scene period are
to be reproduced before playback begins, and generating
plural titles containing essentially the same title content
but different scenes in part, is called parental lock
control.
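
The relationship between the recorded title stream and the two program chains can be sketched as follows. This is an illustrative model only: the list-and-dictionary representation, the function name `program_chain`, and the rating keys "adult" and "child" are assumptions introduced for the example, not structures defined by this specification.

```python
# Common streams are plain strings; the multi-scene period maps each
# rating to the stream selected for it.
TITLE_STREAM = ["SSa", "SSb", {"adult": "SSc", "child": "SSd"}, "SSe"]


def program_chain(title_stream: list, rating: str) -> list[str]:
    """Resolve every multi-scene period to the stream chosen for `rating`."""
    return [item[rating] if isinstance(item, dict) else item for item in title_stream]


PGC1 = program_chain(TITLE_STREAM, "adult")
PGC2 = program_chain(TITLE_STREAM, "child")
print(PGC1)  # ['SSa', 'SSb', 'SSc', 'SSe']
print(PGC2)  # ['SSa', 'SSb', 'SSd', 'SSe']
```

The selection happens once, before playback begins, which is what makes the resulting title stream static.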



[0277] Note that parental lock control is so named because
of the perceived need to protect children from undesirable
content. From the perspective of system stream processing,
however, parental lock control is a technology for
statically generating different title streams by means of
the user pre-selecting specific scenes from a multi-scene
period. Note, further, that this contrasts with multi-angle
scene control, which is a technology for dynamically
changing the content of a single title by means of the user
selecting scenes from the multi-scene period freely and in
real-time during title playback.

[0278] This parental lock control technology can also be
used to enable title stream editing such as when making the
director's cut. The director's cut refers to the process of
editing certain scenes from a movie to, for example, shorten
the total presentation time. This may be necessary, for
example, to edit a feature-length movie for viewing on an
airplane where the presentation time is too long for viewing
within the flight time or certain content may not be
acceptable. The movie director thus determines which scenes
may be cut to shorten the movie. The title can then be
recorded with both a full-length, unedited system stream and
an edited system stream in which the edited scenes are
recorded to multi-scene periods. At the transition from one
system stream to another system stream in such applications,
parental lock control must be able to maintain smooth
playback image output. More specifically, seamless data
reproduction whereby a data underflow state does not occur
in the audio, video, or other buffers, and seamless
information reproduction whereby no unnatural interruptions
are audibly or visibly perceived in the audio and video
playback, are necessary.

Multi-angle control 

[0279] The concept of multi-angle scene control in the
present invention is described next with reference to Fig.
33. In general, multimedia titles are obtained by recording
both the audio and video information (collectively
"recording" below) of the subject over time T. The angled
scene blocks #SC1, #SM1, #SM2, #SM3, and #SC3 represent the
multimedia scenes obtained at recording unit times T1, T2,
and T3 by recording the subject at respective camera angles.
Scenes #SM1, #SM2, and #SM3 are recorded at mutually
different (first, second, and third) camera angles during
recording unit time T2, and are referenced below as the
first, second, and third angled scenes.
[0280] Note that the multi-scene periods referenced herein
are basically assumed to comprise scenes recorded from
different angles. The scenes may, however, be recorded from
the same angle but at different times, or they may be
computer graphics data. The multi-angle scene periods are
thus the multi-scene periods from which plural scenes can be
selected for presentation in the same time period, whether
or not the scenes are actually recorded at different camera
angles.
[0281] Scenes #SC1 and #SC3 are scenes recorded 
at the same common camera angle during recording 
unit times T1 and T3, i.e., before and after the multi- 
angle scenes. These scenes are therefore called "com- 
mon angle scenes." Note that one of the multiple camera 
angles used in the multi-angle scenes is usually the 
same as the common camera angle. 
[0282] To understand the relationship between these various
angled scenes, multi-angle scene control is described below
using a live broadcast of a baseball game as an example
only.

[0283] The common angle scenes #SC1 and #SC3 
are recorded at the common camera angle, which is 
here defined as the view from center field on the axis 
through the pitcher, batter, and catcher. 
[0284] The first angled scene #SM1 is recorded at the 
first multi-camera angle, i.e., the camera angle from the 
backstop on the axis through the catcher, pitcher, and 
batter. The second angled scene #SM2 is recorded at 
the second multi-camera angle, i.e., the view from cent- 
erfield on the axis through the pitcher, batter, and catch- 
er. Note that the second angled scene #SM2 is thus the 
same as the common camera angle in this example. It 
therefore follows that the second angled scene #SM2 is 
the same as the common angle scene #SC2 recorded 
during recording unit time T2. The third angled scene 
#SM3 is recorded at the third multi-camera angle, i.e., 
the camera angle from the backstop focusing on the in- 
field. 

[0285] The presentation times of the multiple angle scenes
#SM1, #SM2, and #SM3 overlap in recording unit time T2; this
period is called the "multi-angle scene period." By freely
selecting one of the multiple angle scenes #SM1, #SM2, and
#SM3 in this multi-angle scene period, the viewer is able to
change his or her virtual viewing position to enjoy a
different view of the game as though the actual camera angle
is changed. Note that while there appears to be a time gap
between common angle scenes #SC1 and #SC3 and the multiple
angle scenes #SM1, #SM2, and #SM3 in Fig. 33, this is simply
to facilitate the use of arrows in the figure for easier
description of the data reproduction paths reproduced by
selecting different angled scenes. There is no actual time
gap during playback.
[0286] Multi-angle scene control of the system stream based
on the present invention is described next with reference to
Fig. 23 from the perspective of connecting data blocks. The
multimedia data corresponding to common angle scene #SC is
referenced as common angle data BA, and the common angle
data BA in recording unit times T1 and T3 are referenced as
BA1 and BA3, respectively. The multimedia data corresponding
to the multiple angle scenes #SM1, #SM2, and #SM3 are
referenced as first, second, and third angle scene data MA1,
MA2, and MA3. As previously described with reference to Fig.
33, scenes from the desired angle can be viewed by selecting
one of the multiple angle data units MA1, MA2, and MA3.
There is also no time gap between the common angle data BA1
and BA3 and the multiple angle data units MA1, MA2, and MA3.
[0287] In the case of an MPEG system stream, however,
intermittent breaks in the playback information can result
between the reproduced common and multiple angle data units
depending upon the content of the data at the connection
between the selected multiple angle data unit MA1, MA2, or
MA3 and the common angle data BA (either the first common
angle data BA1 before the angle selected in the multi-angle
scene period or the common angle data BA3 following the
angle selected in the multi-angle scene period). The result
in this case is that the title stream is not naturally
reproduced as a single contiguous title, i.e., seamless data
reproduction is achieved but non-seamless information
reproduction results.

[0288] The multi-angle selection process whereby one of
plural scenes is selectively reproduced from the multi-angle
scene period with seamless information presentation to the
scenes before and after is described below with application
in a digital video disk system using Fig. 23.

[0289] Changing the scene angle, i.e., selecting one of the
multiple angle data units MA1, MA2, and MA3, must be
completed before reproduction of the preceding common angle
data BA1 is completed. It is extremely difficult, for
example, to change to a different angle data unit MA2 during
reproduction of common angle data BA1. This is because the
multimedia data has a variable length coded MPEG data
structure, which makes it difficult to find the data break
points (boundaries) in the selected data blocks. The video
may also be disrupted when the angle is changed because
inter-frame correlations are used in the coding process. The
group_of_pictures GOP processing unit of the MPEG standard
contains at least one refresh frame, and closed processing
not referencing frames belonging to another GOP is possible
within this GOP processing unit.
[0290] In other words, if the desired angle data, e.g., MA3,
is selected before reproduction reaches the multi-angle
scene period, and at the latest by the time reproduction of
the preceding common angle data BA1 is completed, the angle
data selected from within the multi-angle scene period can
be seamlessly reproduced. However, it is extremely difficult
while reproducing one angle to select and seamlessly
reproduce another angle within the same multi-angle scene
period. It is therefore difficult when in a multi-angle
scene period to dynamically select a different angle unit
presenting, for example, a view from a different camera
angle.

Flow chart: encoder 

[0291] The encoding information table generated by the
encoding system controller 200 from information extracted
from the scenario data St7 is described below referring to
Fig. 27.



[0292] The encoding information table contains VOB 
set data streams containing plural VOB corresponding 
to the scene periods beginning and ending at the scene 
branching and connecting points, and VOB data 
streams corresponding to each scene. These VOB set 
data streams shown in Fig. 27 are the encoding infor- 
mation tables generated at step #100 in Fig. 34 by the 
encoding system controller 200 for creating the DVD 
multimedia stream based on the user-defined title con- 
tent. 

[0293] The user-defined scenario contains branching 
points from common scenes to plural scenes, or con- 
nection points to other common scenes. The VOB cor- 
responding to the scene period delimited by these 
branching and connecting points is a VOB set, and the 
data generated to encode a VOB set is the VOB set data 
stream. The title number specified by the VOB set data 
stream is the title number TITLE_NO of the VOB set da- 
ta stream. 

[0294] The VOB Set data structure in Fig. 27 shows the data
content for encoding one VOB set in the VOB set data stream,
and comprises: the VOB set number VOBS_NO, the VOB number
VOB_NO in the VOB set, the preceding VOB seamless connection
flag VOB_Fsb, the following VOB seamless connection flag
VOB_Fsf, the multi-scene flag VOB_Fp, the interleave flag
VOB_Fi, the multi-angle flag VOB_Fm, the multi-angle
seamless switching flag VOB_FsV, the maximum bit rate of the
interleaved VOB ILV_BR, the number of interleaved VOB
divisions ILV_DIV, and the minimum interleaved unit
presentation time ILVU_MT.
[0295] The VOB set number VOBS_NO is a sequen- 
tial number identifying the VOB set and the position of 
the VOB set in the reproduction sequence of the title 
scenario. 

[0296] The VOB number VOB_NO is a sequential 
number identifying the VOB and the position of the VOB 
in the reproduction sequence of the title scenario. 
[0297] The preceding VOB seamless connection flag 
VOB_Fsb indicates whether a seamless connection 
with the preceding VOB is required for scenario repro- 
duction. 

[0298] The following VOB seamless connection flag 
VOB_Fsf indicates whether there is a seamless connec- 
tion with the following VOB during scenario reproduc- 
tion. 

[0299] The multi-scene flag VOB_Fp identifies wheth- 
er the VOB set comprises plural video objects VOB. 
[0300] The interleave flag VOB_Fi identifies whether 
the VOB in the VOB set are interleaved. 
[0301] The multi-angle flag VOB_Fm identifies whether the
VOB set is a multi-angle set.
[0302] The multi-angle seamless switching flag 
VOB_FsV identifies whether angle changes within the 
multi-angle scene period are seamless or not. 
[0303] The maximum bit rate of the interleaved VOB ILV_BR
defines the maximum bit rate of the interleaved VOBs.
[0304] The number of interleaved VOB divisions ILV_DIV
identifies the number of interleave units in the interleaved
VOB.

[0305] The minimum interleave unit presentation time ILVU_MT
defines the presentation time of the smallest interleave
unit for which a track buffer data underflow state does not
occur during interleaved block reproduction when that unit
is encoded at the maximum bit rate of the interleaved VOB
ILV_BR.
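
The VOB set data fields listed above can be gathered into a single record, for example as below. This is a sketch only: the class name `VobSetData`, the Python types, and the units noted in the comments are assumptions, and the sample values are hypothetical rather than taken from Fig. 27.

```python
from dataclasses import dataclass


@dataclass
class VobSetData:
    VOBS_NO: int      # VOB set number
    VOB_NO: int       # VOB number within the VOB set
    VOB_Fsb: bool     # seamless connection with the preceding VOB
    VOB_Fsf: bool     # seamless connection with the following VOB
    VOB_Fp: bool      # multi-scene: the set comprises plural VOB
    VOB_Fi: bool      # the VOB in the set are interleaved
    VOB_Fm: bool      # the set is a multi-angle set
    VOB_FsV: bool     # angle changes within the period are seamless
    ILV_BR: int       # maximum bit rate of the interleaved VOB (unit assumed: bit/s)
    ILV_DIV: int      # number of interleave units in the interleaved VOB
    ILVU_MT: float    # minimum interleave unit presentation time (unit assumed: s)


# Hypothetical values for a seamless, interleaved multi-angle VOB set.
angle_set = VobSetData(
    VOBS_NO=3, VOB_NO=4,
    VOB_Fsb=True, VOB_Fsf=True,
    VOB_Fp=True, VOB_Fi=True, VOB_Fm=True, VOB_FsV=True,
    ILV_BR=8_000_000, ILV_DIV=20, ILVU_MT=0.5,
)
print(angle_set.VOB_Fm and angle_set.VOB_FsV)  # True
```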

[0306] The encoding information table for each VOB generated
by the encoding system controller 200 based on the scenario
data St7 is described below referring to Fig. 28. The VOB
encoding parameters described below and supplied to the
video encoder 300, audio encoder 700, and system encoder 900
for stream encoding are produced based on this encoding
information table.
[0307] The VOB data streams shown in Fig. 28 are the
encoding information tables generated at step #100 in Fig.
34 by the encoding system controller 200 for creating the
DVD multimedia stream based on the user-defined title
content.

[0308] The encoding unit is the video object VOB, and the
data generated to encode each video object VOB is the VOB
data stream. For example, a VOB set comprising three angle
scenes comprises three video objects VOB. The data structure
shown in Fig. 28 shows the content of the data for encoding
one VOB in the VOB data stream.

[0309] The VOB data structure contains the video material
start time VOB_VST, the video material end time VOB_VEND,
the video signal type VOB_V_KIND, the video encoding bit
rate V_BR, the audio material start time VOB_AST, the audio
material end time VOB_AEND, the audio coding method
VOB_A_KIND, and the audio encoding bit rate A_BR.
[0310] The video material start time VOB_VST is the video
encoding start time corresponding to the time of the video
signal.

[0311] The video material end time VOB_VEND is the video
encoding end time corresponding to the time of the video
signal.

[0312] The video material type VOB_V_KIND identifies whether
the encoded material is in the NTSC or PAL format, for
example, or is photographic material (a movie, for example)
converted to a television broadcast format (so-called
telecine conversion).
[0313] The video encoding bit rate V_BR is the bit rate at
which the video signal is encoded.
[0314] The audio material start time VOB_AST is the audio
encoding start time corresponding to the time of the audio
signal.

[0315] The audio material end time VOB_AEND is the audio
encoding end time corresponding to the time of the audio
signal.

[0316] The audio coding method VOB_A_KIND identifies the
audio encoding method as AC-3, MPEG, or linear PCM, for
example.

[0317] The audio encoding bit rate A_BR is the bit rate at
which the audio signal is encoded.
[0318] The encoding parameters used by the video encoder
300, sub-picture encoder 500, audio encoder 700, and system
encoder 900 for VOB encoding are shown in Fig. 29. The
encoding parameters include: the VOB number VOB_NO, the
video encode start time V_STTM, the video encode end time
V_ENDTM, the video encode mode V_ENCMD, the video encode bit
rate V_RATE, the maximum video encode bit rate V_MRATE, the
GOP structure fixing flag GOP_Fxflag, the video encode GOP
structure GOPST, the initial video encode data V_INTST, the
last video encode data V_ENDST, the audio encode start time
A_STTM, the audio encode end time A_ENDTM, the audio encode
bit rate A_RATE, the audio encode method A_ENCMD, the audio
start gap A_STGAP, the audio end gap A_ENDGAP, the preceding
VOB number B_VOB_NO, and the following VOB number F_VOB_NO.
[0319] The VOB number VOB_NO is a sequential 
number identifying the VOB and the position of the VOB 
in the reproduction sequence of the title scenario. 
[0320] The video encode start time V_STTM is the 
start time of video material encoding. 
[0321] The video encode end time V_ENDTM is the 
end time of video material encoding. 
[0322] The video encode mode V_ENCMD is an en- 
coding mode for declaring whether reverse telecine con- 
version shall be accomplished during video encoding to 
enable efficient coding when the video material is tele- 
cine converted material. 

[0323] The video encode bit rate V_RATE is the av- 
erage bit rate of video encoding. 
[0324] The maximum video encode bit rate V_MRATE 
is the maximum bit rate of video encoding. 
[0325] The GOP structure fixing flag GOP_Fxflag specifies
whether encoding is accomplished without changing the GOP
structure in the middle of the video encoding process. This
is a useful parameter for declaring whether seamless
switching is enabled in a multi-angle scene period.

[0326] The video encode GOP structure GOPST is 
the GOP structure data from encoding. 
[0327] The initial video encode data V_INTST sets the
initial value of the VBV buffer (decoder buffer) at the
start of video encoding, and is referenced during video
decoding to initialize the decoding buffer. This is a useful
parameter for declaring seamless reproduction with the
preceding encoded video stream.
[0328] The last video encode data V_ENDST sets the 
end value of the VBV buffer (decoder buffer) at the end 
of video encoding, and is referenced during video de- 
coding to initialize the decoding buffer. This is a useful 
parameter for declaring seamless reproduction with the 
preceding encoded video stream. 
[0329] The audio encode start time A_STTM is the 
start time of audio material encoding. 
[0330] The audio encode end time A_ENDTM is the end time of
audio material encoding.



[0331] The audio encode bit rate A_RATE is the bit 
rate used for audio encoding. 

[0332] The audio encode method A_ENCMD identifies the audio
encoding method as AC-3, MPEG, or linear PCM, for example.

[0333] The audio start gap A_STGAP is the time off- 
set between the start of the audio and video presenta- 
tion at the beginning of a VOB. This is a useful param- 
eter for declaring seamless reproduction with the pre- 
ceding encoded system stream. 
[0334] The audio end gap A_ENDGAP is the time off- 
set between the end of the audio and video presentation 
at the end of a VOB. This is a useful parameter for de- 
claring seamless reproduction with the preceding en- 
coded system stream. 

[0335] The preceding VOB number B_VOB_NO is the VOB_NO of
the preceding VOB when there is a seamlessly connected
preceding VOB.
[0336] The following VOB number F_VOB_NO is the VOB_NO of
the following VOB when there is a seamlessly connected
following VOB.
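
The encoding parameters described above can likewise be collected into one record per VOB. The following is a sketch under stated assumptions: the class name `VobEncodeParams`, the Python types, the units in the comments, and the three consistency checks in `problems` are introduced here for illustration and are not prescribed by this specification; the sample values are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class VobEncodeParams:
    VOB_NO: int
    V_STTM: float        # video encode start time (unit assumed: s)
    V_ENDTM: float       # video encode end time
    V_ENCMD: bool        # True if reverse telecine conversion is applied
    V_RATE: int          # average video bit rate (unit assumed: bit/s)
    V_MRATE: int         # maximum video bit rate
    GOP_Fxflag: bool     # GOP structure is kept fixed during encoding
    GOPST: str           # GOP structure data (representation assumed)
    V_INTST: int         # initial VBV buffer value
    V_ENDST: int         # end VBV buffer value
    A_STTM: float        # audio encode start time
    A_ENDTM: float       # audio encode end time
    A_RATE: int          # audio bit rate
    A_ENCMD: str         # e.g. "AC-3", "MPEG", or "LPCM"
    A_STGAP: float       # audio/video offset at the start of the VOB
    A_ENDGAP: float      # audio/video offset at the end of the VOB
    B_VOB_NO: Optional[int] = None  # preceding VOB, if seamlessly connected
    F_VOB_NO: Optional[int] = None  # following VOB, if seamlessly connected

    def problems(self) -> list[str]:
        """Return basic consistency problems (checks are this sketch's own)."""
        found = []
        if self.V_STTM >= self.V_ENDTM:
            found.append("video start time must precede video end time")
        if self.A_STTM >= self.A_ENDTM:
            found.append("audio start time must precede audio end time")
        if self.V_RATE > self.V_MRATE:
            found.append("average video bit rate exceeds the maximum")
        return found


# Hypothetical parameter set for VOB 4, seamlessly connected to VOB 3 and VOB 6.
params = VobEncodeParams(
    VOB_NO=4, V_STTM=0.0, V_ENDTM=60.0, V_ENCMD=False,
    V_RATE=4_000_000, V_MRATE=8_000_000, GOP_Fxflag=True, GOPST="N=15,M=3",
    V_INTST=0, V_ENDST=0, A_STTM=0.0, A_ENDTM=60.0, A_RATE=384_000,
    A_ENCMD="AC-3", A_STGAP=0.0, A_ENDGAP=0.0, B_VOB_NO=3, F_VOB_NO=6,
)
print(params.problems())  # []
```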
[0337] The operation of a DVD encoder ECD accord- 
ing to the present invention is described below with ref- 
erence to the flow chart in Fig. 34. Note that the steps 
shown with a double line are subroutines. It should be 
obvious that while the operation described below relates 
specifically in this case to the DVD encoder ECD of the 
present invention, the operation described also applies 
to an authoring encoder EC. 

[0338] At step #100, the user inputs the editing com- 
mands according to the user-defined scenario while 
confirming the content of the multimedia source data 
streams St1, St2, and St3. 

[0339] At step #200, the scenario editor 100 gener- 
ates the scenario data St7 containing the above edit 
command information according to the user's editing in- 
structions. 

[0340] When generating the scenario data St7 in step 
#200, the user editing commands related to multi-angle 
and parental lock multi-scene periods in which interleav- 
ing is presumed must be input to satisfy the following 
conditions. 

[0341] First, the VOB maximum bit rate must be set 
to assure sufficient image quality, and the track buffer 
capacity, jump performance, jump time, and jump dis- 
tance of the DVD decoder DCD used as the reproduc- 
tion apparatus of the DVD encoded data must be deter- 
mined. Based on these values, the reproduction time of 
the shortest interleaved unit is obtained from equations 
3 and 4. Based on the reproduction time of each scene 
in the multi-scene period, it must then be determined 
whether equations 5 and 6 are satisfied. If equations 5
and 6 are not satisfied, the user must change the edit
commands until equations 5 and 6 are satisfied by, for
example, connecting part of the following scene to each
scene in the multi-scene period.
[0342] When multi-angle edit commands are used, equation 7
must be satisfied for seamless switching, and edit commands
matching the audio reproduction time with the reproduction
time of each scene in each angle must be entered. If
non-seamless switching is used, the user must enter commands
to satisfy equation 8.
[0343] At step #300, the encoding system controller 200
first determines whether the target scene is to be
seamlessly connected to the preceding scene based on the
scenario data St7.

[0344] Note that when the preceding scene period is a
multi-scene period comprising plural scenes but the
presently selected target scene is a common scene (not in a
multi-scene period), a seamless connection refers to
seamlessly connecting the target scene with any one of the
scenes contained in the preceding multi-scene period. When
the target scene is a multi-scene period, a seamless
connection still refers to seamlessly connecting the target
scene with any one of the scenes from the same multi-scene
period.

[0345] If step #300 returns NO, i.e., a non-seamless
connection is valid, the procedure moves to step #400.
[0346] At step #400, the encoding system controller 
200 resets the preceding VOB seamless connection flag 
VOB_Fsb indicating whether there is a seamless connection
between the target and preceding scenes. The
procedure then moves to step #600.

[0347] On the other hand, if step #300 returns YES,
i.e., there is a seamless connection to the preceding
scene, the procedure moves to step #500. 
[0348] At step #500 the encoding system controller
200 sets the preceding VOB seamless connection flag
VOB_Fsb. The procedure then moves to step #600. 
[0349] At step #600 the encoding system controller 
200 determines whether there is a seamless connection 
between the target and following scenes based on scenario
data St7. If step #600 returns NO, i.e., a non-seamless
connection is valid, the procedure moves to step
#700. 

[0350] At step #700, the encoding system controller 
200 resets the following VOB seamless connection flag 
VOB_Fsf indicating whether there is a seamless connection
with the following scene. The procedure then
moves to step #900. 

[0351] However, if step #600 returns YES, i.e., there 
is a seamless connection to the following scene, the
procedure moves to step #800.

[0352] At step #800 the encoding system controller 
200 sets the following VOB seamless connection flag 
VOB_Fsf. The procedure then moves to step #900. 
[0353] At step #900 the encoding system controller
200 determines whether there is more than one connection
target scene, i.e., whether a multi-scene period is
selected, based on the scenario data St7. As previously
described, there are two possible control methods in
multi-scene periods: parental lock control whereby only
one of plural possible reproduction paths that can be
constructed from the scenes in the multi-scene period
is reproduced, and multi-angle control whereby the
reproduction path can be switched within the multi-scene
period to present different viewing angles.
[0354] If step #900 returns NO, i.e., there are not mul- 
tiple scenes, the procedure moves to step #1000. 
[0355] At step #1000 the multi-scene flag VOB_Fp 
identifying whether the VOB set comprises plural video
objects VOB (a multi-scene period is selected) is reset, 
and the procedure moves to step #1800 for encode pa- 
rameter production. This encode parameter production 
subroutine is described below. 

[0356] However, if step #900 returns YES, i.e., there is a
multi-scene connection, the procedure moves to step 
#1100. 

[0357] At step #1100, the multi-scene flag VOB_Fp is 
set, and the procedure moves to step #1200 whereat it 
is judged whether a multi-angle connection is selected,
or not. 

[0358] At step #1200 it is determined whether a 
change is made between plural scenes in the multi- 
scene period, i.e., whether a multi-angle scene period 
is selected. If step #1200 returns NO, i.e., no scene
change is allowed in the multi-scene period as parental 
lock control reproducing only one reproduction path has 
been selected, the procedure moves to step #1300. 
[0359] At step #1300 the multi-angle flag VOB_Fm 
identifying whether the target connection scene is a
multi-angle scene is reset, and the procedure moves to step
#1302. 

[0360] At step #1302 it is determined whether either
the preceding VOB seamless connection flag VOB_Fsb
or following VOB seamless connection flag VOB_Fsf is
set. If step #1302 returns YES, i.e., the target connection
scene seamlessly connects to the preceding, the follow- 
ing, or both the preceding and following scenes, the pro- 
cedure moves to step #1304. 

[0361] At step #1304 the interleave flag VOB_Fi
identifying whether the VOB, the encoded data of the target
scene, is interleaved is set. The procedure then moves 
to step #1800. 

[0362] However, if step #1302 returns NO, i.e., the
target connection scene does not seamlessly connect to
the preceding or following scene, the procedure moves 
to step #1306. 

[0363] At step #1306 the interleave flag VOB_Fi is
reset, and the procedure moves to step #1800.
[0364] If step #1200 returns YES, however, i.e., there
is a multi-angle connection, the procedure moves to
step #1400. 

[0365] At step #1400, the multi-angle flag VOB_Fm 
and interleave flag VOB_Fi are set, and the procedure 
moves to step #1500.
[0366] At step #1500 the encoding system controller 
200 determines whether the audio and video can be 
seamlessly switched in a multi-angle scene period, i.e., 
at a reproduction unit smaller than the VOB, based on 
the scenario data St7. If step #1500 returns NO, i.e.,
non-seamless switching occurs, the procedure moves 
to step #1600. 

[0367] At step #1600 the multi-angle seamless 
switching flag VOB_FsV indicating whether angle
changes within the multi-angle scene period are seam- 
less or not is reset, and the procedure moves to step 
#1800. 

[0368] However, if step #1500 returns YES, i.e., 
seamless switching occurs, the procedure moves to 
step #1700. 

[0369] At step #1700 the multi-angle seamless 
switching flag VOB_FsV is set, and the procedure 
moves to step #1800. 
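
The flag decisions of steps #300 through #1700 can be summarized in a short sketch. The Python below is illustrative only: the `SceneInfo` fields and the function name are hypothetical stand-ins for what the encoding system controller 200 reads from the scenario data St7, and the default value of 0 for flags that a given path does not touch is an assumption. Only the flag names and step numbers come from the description above.

```python
from dataclasses import dataclass


@dataclass
class SceneInfo:
    """Hypothetical summary of what scenario data St7 says about one VOB Set."""
    seamless_to_preceding: bool
    seamless_to_following: bool
    multi_scene: bool
    multi_angle: bool = False
    seamless_angle_switch: bool = False


def set_vob_flags(scene: SceneInfo) -> dict:
    # Assumption: flags not touched on a given path keep a default of 0.
    flags = {"VOB_Fsb": 0, "VOB_Fsf": 0, "VOB_Fp": 0,
             "VOB_Fi": 0, "VOB_Fm": 0, "VOB_FsV": 0}
    flags["VOB_Fsb"] = int(scene.seamless_to_preceding)   # steps #300 to #500
    flags["VOB_Fsf"] = int(scene.seamless_to_following)   # steps #600 to #800
    if not scene.multi_scene:                             # step #900 NO, step #1000
        return flags
    flags["VOB_Fp"] = 1                                   # step #1100
    if not scene.multi_angle:                             # step #1200 NO, step #1300
        # Steps #1302 to #1306: interleave only if a seamless connection exists.
        flags["VOB_Fi"] = int(flags["VOB_Fsb"] == 1 or flags["VOB_Fsf"] == 1)
        return flags
    flags["VOB_Fm"] = 1                                   # step #1400
    flags["VOB_Fi"] = 1
    flags["VOB_FsV"] = int(scene.seamless_angle_switch)   # steps #1500 to #1700
    return flags
```

For example, `set_vob_flags(SceneInfo(True, False, True, True, False))` follows the non-seamless multi-angle path and leaves VOB_FsV reset.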

[0370] Therefore, as shown by the flow chart in Fig. 
34, encode parameter production (step #1800) is only 
begun after the editing information is detected from the 
above flag settings in the scenario data St7 reflecting 
the user-defined editing instructions. 
[0371] Based on the user-defined editing instructions 
detected from the above flag settings in the scenario da- 
ta St7, information is added to the encoding information 
tables for the VOB Set units and VOB units as shown in 
Figs. 27 and 28 to encode the source streams, and the 
encoding parameters of the VOB data units shown in 
Fig. 29 are produced in step #1800. The procedure then
moves to step #1900 for audio and video encoding. 
[0372] The encode parameter production steps (step 
#1800) are described in greater detail below referring to 
Figs. 35, 36, 37, and 38. 

[0373] Based on the encode parameters produced in 
step #1800, the video data and audio data are encoded 
in step #1900, and the procedure moves to step #2000.
[0374] Note that the sub-picture data is normally in- 
serted during video reproduction on an as-needed ba- 
sis, and contiguity with the preceding and following 
scenes is therefore not usually necessary. Moreover, 
the sub-picture data is normally video information for 
one frame, and unlike audio and video data having an 
extended time-base, sub-picture data is usually static, 
and is not normally presented continuously. Because 
the present invention relates specifically to seamless 
and non-seamless contiguous reproduction as de- 
scribed above, description of sub-picture data encoding 
is omitted herein for simplicity. 

[0375] Step #2000 is the last step in a loop comprising 
steps #300 to step #2000, and causes this loop to be 
repeated as many times as there are VOB Sets. This 
loop formats the program chain VTS_PGC#i to contain 
the reproduction sequence and other reproduction in- 
formation for each VOB in the title (Fig. 16) in the pro- 
gram chain data structure, interleaves the VOB in the 
multi-scene periods, and completes the VOB Set data 
stream and VOB data stream needed for system stream 
encoding. The procedure then moves to step #2100.
[0376] At step #2100 the VOB Set data stream is completed
as the encoding information table by adding the
total number of VOB Sets VOBS_NUM obtained as a 
result of the loop through step #2000 to the VOB Set 
data stream, and setting the number of titles TITLE_NO 
defining the number of scenario reproduction paths in 
the scenario data St7. The procedure then moves to 
step #2200.

[0377] System stream encoding producing the VOB 
(VOB#i) data in the VTS title VOBS (VTSTT_VOBS)
(Fig. 16) is accomplished in step #2200 based on the
encoded video stream and encoded audio stream output
from step #1900, and the encode parameters in Fig.
29. The procedure then moves to step #2300. 
[0378] At step #2300 the VTS information VTSI, VTSI 
management table VTSI_MAT, VTSPGC information table
VTS_PGCIT, and the program chain information
VTS_PGCI#i controlling the VOB data reproduction se- 
quence shown in Fig. 16 are produced, and formatting 
to, for example, interleave the VOB contained in the mul- 
ti-scene periods, is accomplished. 

[0379] The encode parameter production subroutine
shown as step #1800 in Fig. 34B is described next with
reference to Figs. 35, 36, and 37, using by way of example
the operation generating the encode parameters for multi-angle
control. 

[0380] Starting from Fig. 35, the process for generating
the encode parameters of a non-seamless switching
stream with multi-angle control is described first. This
stream is generated when step #1500 in Fig. 34 returns
NO and the following flags are set as shown: VOB_Fsb
= 1 or VOB_Fsf = 1, VOB_Fp = 1, VOB_Fi = 1, VOB_Fm
= 1, and VOB_FsV = 0. The following operation produces
the encoding information tables shown in Fig. 27 and
Fig. 28, and the encode parameters shown in Fig. 29. 
[0381] At step #1812, the scenario reproduction sequence
(path) contained in the scenario data St7 is extracted,
the VOB Set number VOBS_NO is set, and the
VOB number VOB_NO is set for one or more VOB in 
the VOB Set. 

[0382] At step #1814 the maximum bit rate ILV_BR of 
the interleaved VOB is extracted from the scenario data
St7, and the maximum video encode bit rate V_MRATE 
from the encode parameters is set based on the inter- 
leave flag VOB_Fi setting (=1). 
[0383] At step #1816, the minimum interleaved unit 
presentation time ILVU_MT is extracted from the scenario
data St7.

[0384] At step #1818, the video encode GOP struc- 
ture GOPST values N = 15 and M = 3 are set, and the 
GOP structure fixing flag GOP_Fxflag is set (= 1), based
on the multi-scene flag VOB_Fp setting (= 1).
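
To make the GOP structure values concrete, the sketch below lists the picture types of one GOP for N = 15 and M = 3. It assumes the conventional MPEG reading of these parameters (N is the number of pictures in the GOP, M is the spacing between successive I or P pictures), which the text does not restate here. The ordering is a simplified one that starts at the I picture and ignores the difference between coded and display order. The function name is hypothetical.

```python
def gop_pattern(n: int = 15, m: int = 3) -> str:
    """Picture types of one GOP, listed from the I picture onward.

    Assumption: n is the GOP length and m is the distance between
    anchor (I or P) pictures, as in common MPEG encoder settings.
    """
    if n <= 0 or m <= 0:
        raise ValueError("n and m must be positive")
    types = []
    for i in range(n):
        if i == 0:
            types.append("I")      # one I picture per GOP
        elif i % m == 0:
            types.append("P")      # anchor picture every m positions
        else:
            types.append("B")      # bidirectional pictures in between
    return "".join(types)


print(gop_pattern())  # IBBPBBPBBPBBPBB
```

Fixing this structure for every VOB in a multi-scene period keeps the picture layout identical across the scenes that may be switched or selected.
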
[0385] Step #1820 is the common VOB data setting 
routine, which is described below referring to the flow 
chart in Fig. 36. This common VOB data setting routine 
produces the encoding information tables shown in 
Figs. 27 and 28, and the encode parameters shown in
Fig. 29. 

[0386] At step #1822 the video material start time 
VOB_VST and video material end time VOB_VEND are 
extracted for each VOB, and the video encode start time 
V_STTM and video encode end time V_ENDTM are
used as video encoding parameters. 
[0387] At step #1824 the audio material start time 
VOB_AST of each VOB is extracted from the scenario 
data St7, and the audio encode start time A_STTM is
set as an audio encoding parameter. 
[0388] At step #1826 the audio material end time
VOB_AEND is extracted for each VOB from the scenario
data St7. A time not exceeding the VOB_AEND time and
falling on an audio access unit (AAU) boundary is set as
the audio encode end time A_ENDTM, which is an audio
encoding parameter. Note that the audio access unit
AAU is determined by the audio encoding method.
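
The selection of A_ENDTM in step #1826 amounts to rounding the audio end time down to a whole number of audio access units. The following is a minimal sketch under stated assumptions: times are integer ticks of a 90 kHz clock, AAU boundaries are counted from A_STTM, and the function and parameter names are hypothetical. The text itself only requires that A_ENDTM not exceed VOB_AEND and fall on an AAU boundary.

```python
def audio_encode_end_time(a_sttm: int, vob_aend: int, aau_ticks: int) -> int:
    """Latest AAU boundary, counted from a_sttm, that does not exceed vob_aend.

    All arguments are in clock ticks (assumed 90 kHz). aau_ticks is the
    duration of one audio access unit for the chosen audio coding method.
    """
    if aau_ticks <= 0 or vob_aend < a_sttm:
        raise ValueError("invalid audio timing")
    whole_units = (vob_aend - a_sttm) // aau_ticks   # complete AAUs that fit
    return a_sttm + whole_units * aau_ticks


# Example: a 32 ms access unit is 2880 ticks at 90 kHz.
print(audio_encode_end_time(0, 10_000, 2880))  # 8640
```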

[0389] At step #1828 the audio start gap A_STGAP 
obtained from the difference between the video encode 
start time V_STTM and the audio encode start time 
A_STTM is defined as a system encode parameter. 
[0390] At step #1830 the audio end gap A_ENDGAP 
obtained from the difference between the video encode 
end time V_ENDTM and the audio encode end time 
A_ENDTM is defined as a system encode parameter. 
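
Steps #1828 and #1830 each reduce to one subtraction. The sketch below shows the arithmetic only. The sign convention (audio start minus video start, video end minus audio end) is an assumption, since the text says only that each gap is obtained from the difference between the two times. The function name is hypothetical.

```python
def audio_gaps(v_sttm: int, v_endtm: int, a_sttm: int, a_endtm: int) -> tuple:
    """Return (A_STGAP, A_ENDGAP) in the same time units as the inputs.

    Assumed sign convention: positive values mean the audio starts after
    the video starts and ends before the video ends.
    """
    a_stgap = a_sttm - v_sttm      # step #1828
    a_endgap = v_endtm - a_endtm   # step #1830
    return a_stgap, a_endgap


print(audio_gaps(v_sttm=0, v_endtm=9000, a_sttm=300, a_endtm=8640))  # (300, 360)
```
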
[0391] At step #1832 the video encoding bit rate 
V_BR is extracted from the scenario data St7, and the 
video encode bit rate V_RATE, which is the average bit 
rate of video encoding, is set as a video encoding pa- 
rameter. 

[0392] At step #1834 the audio encoding bit rate 
A_BR is extracted from the scenario data St7, and the 
audio encode bit rate A_RATE is set as an audio encod- 
ing parameter. 

[0393] At step #1836 the video material type 
VOB_V_KIND is extracted from the scenario data St7. 
If the material is a film type, i.e., a movie converted to 
television broadcast format (so-called telecine conver- 
sion), reverse telecine conversion is set for the video 
encode mode V_ENCMD, and defined as a video en- 
coding parameter. 

[0394] At step #1838 the audio coding method 
VOB_A_KIND is extracted from the scenario data St7, 
and the encoding method is set as the audio encode
method A_ENCMD and defined as an audio encoding
parameter.

[0395] At step #1840 the initial video encode data
V_INTST sets the initial value of the VBV buffer to a
value less than the VBV buffer end value set by the last
video encode data V_ENDST, and is defined as a video
encoding parameter.

[0396] At step #1842 the VOB number VOB_NO of 
the preceding connection is set to the preceding VOB 
number B_VOB_NO based on the setting (= 1) of the 
preceding VOB seamless connection flag VOB_Fsb, 
and set as a system encode parameter. 
[0397] At step #1844 the VOB number VOB_NO of 
the following connection is set to the following VOB 
number F_VOB_NO based on the setting (= 1) of the 
following VOB seamless connection flag VOB_Fsf, and 
set as a system encode parameter. 
[0398] The encoding information table and encode 
parameters are thus generated for a multi-angle VOB 
Set with non-seamless multi-angle switching control en- 
abled. 
[0399] The process for generating the encode parameters
of a seamless switching stream with multi-angle
control is described below with reference to Fig. 37. This 
stream is generated when step #1500 in Fig. 34 returns 
YES and the following flags are set as shown: VOB_Fsb
= 1 or VOB_Fsf = 1, VOB_Fp = 1, VOB_Fi = 1, VOB_Fm
= 1, and VOB_FsV = 1. The following operation produces
the encoding information tables shown in Fig. 27 and
Fig. 28, and the encode parameters shown in Fig. 29. 
[0400] The following operation produces the encod- 
ing information tables shown in Fig. 27 and Fig. 28, and 
the encode parameters shown in Fig. 29. 
[0401] At step #1850, the scenario reproduction se- 
quence (path) contained in the scenario data St7 is ex- 
tracted, the VOB Set number VOBS_NO is set, and the 
VOB number VOB_NO is set for one or more VOB in 
the VOB Set. 

[0402] At step #1852 the maximum bit rate ILV_BR
of the interleaved VOB is extracted from the scenario 
data St7, and the maximum video encode bit rate 
V_MRATE from the encode parameters is set based on 
the interleave flag VOB_Fi setting (= 1). 
[0403] At step #1854, the minimum interleaved unit 
presentation time ILVU_MT is extracted from the sce- 
nario data St7. 

[0404] At step #1856, the video encode GOP struc- 
ture GOPST values N = 15 and M = 3 are set, and the 
GOP structure fixing flag GOP_Fxflag is set (= 1), based
on the multi-scene flag VOB_Fp setting (= 1). 
[0405] At step #1858, the video encode GOP GOPST 
is set to "closed GOP" based on the multi-angle seamless
switching flag VOB_FsV setting (= 1), and the video
encoding parameters are thus defined. 
[0406] Step #1860 is the common VOB data setting 
routine, which is as described referring to the flow chart 
in Fig. 35. Further description thereof is thus omitted 
here. 

[0407] The encode parameters of a seamless switch- 
ing stream with multi-angle control are thus defined for 
a VOB Set with multi-angle control as described above. 
[0408] The process for generating the encode param- 
eters for a system stream in which parental lock control 
is implemented is described below with reference to Fig. 
38. This stream is generated when step #1200 in Fig. 
34 returns NO and step #1304 returns YES, i.e., the fol- 
lowing flags are set as shown: VOB_Fsb = 1 or VOB_Fsf 
= 1, VOB_Fp = 1, VOB_Fi = 1, VOB_Fm = 0. The fol- 
lowing operation produces the encoding information ta- 
bles shown in Fig. 27 and Fig. 28, and the encode pa- 
rameters shown in Fig. 29. 

[0409] At step #1870, the scenario reproduction se- 
quence (path) contained in the scenario data St7 is ex- 
tracted, the VOB Set number VOBS_NO is set, and the 
VOB number VOB_NO is set for one or more VOB in 
the VOB Set. 

[0410] At step #1872 the maximum bit rate ILV_BR of 
the interleaved VOB is extracted from the scenario data 
St7, and the maximum video encode bit rate V_MRATE
from the encode parameters is set based on the interleave
flag VOB_Fi setting (= 1).
[0411] At step #1874 the number of interleaved VOB
divisions ILV_DIV is extracted from the scenario data 
St7. 

[0412] Step #1876 is the common VOB data setting 
routine, which is as described referring to the flow chart 
in Fig. 35. Further description thereof is thus omitted 
here. 

[0413] The encode parameters of a system stream in 
which parental lock control is implemented are thus de- 
fined for a VOB Set with multi-scene selection control 
enabled as described above. 

[0414] The process for generating the encode param- 
eters for a system stream containing a single scene is 
described below with reference to Fig. 32. This stream 
is generated when step #900 in Fig. 34 returns NO, i.e., 
when VOB_Fp=0. The following operation produces the 
encoding information tables shown in Fig. 27 and Fig. 
28, and the encode parameters shown in Fig. 29. 
[0415] At step #1880, the scenario reproduction se- 
quence (path) contained in the scenario data St7 is ex- 
tracted, the VOB Set number VOBS_NO is set, and the 
VOB number VOB_NO is set for one or more VOB in 
the VOB Set. 

[0416] At step #1882 the maximum bit rate ILV_BR of
the interleaved VOB is extracted from the scenario data 
St7, and the maximum video encode bit rate V_MRATE 
from the encode parameters is set based on the inter- 
leave flag VOB_Fi setting (= 1). 
[0417] Step #1884 is the common VOB data setting 
routine, which is as described referring to the flow chart 
in Fig. 35. Further description thereof is thus omitted 
here. 

[0418] These flow charts for defining the encoding in- 
formation table and encode parameters thus generate 
the parameters for DVD video, audio, and system 
stream encoding by the DVD formatter. 
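
The four parameter routines described above are selected by the flag combinations stated for each. The sketch below shows that selection as a plain function. The returned labels are hypothetical names for the routines of Figs. 35, 37, and 38 and for the single-scene routine. The combination VOB_Fp = 1, VOB_Fm = 0, VOB_Fi = 0 is not covered by the description, so it is reported rather than guessed.

```python
def select_parameter_routine(flags: dict) -> str:
    """Return a hypothetical label for the parameter routine matching the flags."""
    if flags["VOB_Fp"] == 0:
        return "single_scene"                  # step #900 returned NO
    if flags["VOB_Fm"] == 1:
        if flags["VOB_FsV"] == 1:
            return "multi_angle_seamless"      # Fig. 37
        return "multi_angle_non_seamless"      # Fig. 35
    if flags["VOB_Fi"] == 1:
        return "parental_lock"                 # Fig. 38
    return "not_described"                     # combination not covered in the text


example = {"VOB_Fp": 1, "VOB_Fm": 0, "VOB_Fi": 1, "VOB_FsV": 0}
print(select_parameter_routine(example))  # parental_lock
```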

Decoder flow charts 

Disk-to-stream buffer transfer flow 

[0419] The decoding information table produced by 
the decoding system controller 2300 based on the sce- 
nario selection data St51 is described below referring to 
Figs. 58 and 59. The decoding information table com- 
prises the decoding system table shown in Fig. 58, and 
the decoding table shown in Fig. 59. 
[0420] As shown in Fig. 58, the decoding system table 
comprises a scenario information register and a cell in- 
formation register. The scenario information register 
records the title number and other scenario reproduction 
information selected by the user and extracted from the 
scenario selection data St51. The cell information reg- 
ister extracts and records the information required to re- 
produce the cells constituting the program chain PGC 
based on the user-defined scenario information extracted
into the scenario information register.
[0421] More specifically, the scenario information reg- 
ister contains plural sub-registers, i.e., the angle 
number ANGLE_NO_reg, VTS number VTS_NO_reg, 
PGC number VTS_PGCI_NO_reg, audio ID
AUDIO_ID_reg, sub-picture ID SP_ID_reg, and the sys- 
tem clock reference SCR buffer SCR_buffer. 
[0422] The angle number ANGLE_NO_reg stores 
which angle is reproduced when there are multiple angles
in the reproduction program chain PGC.
[0423] The VTS number VTS_NO_reg records the 
number of the next VTS reproduced from among the plu- 
ral VTS on the disk. 

[0424] The PGC number VTS_PGCI_NO_reg 
records which of the plural program chains PGC present
in the video title set VTS is to be reproduced for parental 
lock control or other applications. 
[0425] The audio ID AUDIO_ID_reg records which of
the plural audio streams in the VTS are to be reproduced.
[0426] The sub-picture ID SP_ID_reg records which 
of the plural sub-picture streams is to be reproduced
when there are plural sub-picture streams in the VTS. 
[0427] The system clock reference SCR buffer 
SCR_buffer is the buffer for temporarily storing the system
clock reference SCR recorded to the pack header
as shown in Fig. 19. As described using Fig. 26, this
temporarily stored system clock reference SCR is output
to the decoding system controller 2300 as the bitstream
control data St63.
[0428] The cell information register contains the following
sub-registers: the cell block mode CBM_reg, cell
block type CBT_reg, seamless reproduction flag
SPF_reg, interleaved allocation flag IAF_reg, STC resetting
flag STCDF, seamless angle change flag
SACF_reg, first cell VOBU start address
C_FVOBU_SA_reg, and last cell VOBU start address
C_LVOBU_SA_reg.

[0429] The cell block mode CBM_reg stores a value 
indicating whether plural cells constitute one functional
block. If there are not plural cells in one functional block,
CBM_reg stores N_BLOCK. If plural cells constitute one
functional block, the value F_CELL is stored as the
CBM_reg value of the first cell in the block, L_CELL is
stored as the CBM_reg value of the last cell in the block,
and BLOCK is stored as the CBM_reg value of all cells
between the first and last cells in the block. 
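
As an illustration of the CBM_reg rule, the sketch below lists the value stored for each cell of a functional block of a given size. The function name is hypothetical, and the values are shown as strings purely for readability. The text does not specify how they are encoded in the register.

```python
from typing import List


def cell_block_modes(cells_in_block: int) -> List[str]:
    """CBM_reg value for each cell of one functional block, in cell order."""
    if cells_in_block < 1:
        raise ValueError("a block needs at least one cell")
    if cells_in_block == 1:
        return ["N_BLOCK"]                      # not plural cells in the block
    middle = ["BLOCK"] * (cells_in_block - 2)   # cells between first and last
    return ["F_CELL"] + middle + ["L_CELL"]


print(cell_block_modes(4))  # ['F_CELL', 'BLOCK', 'BLOCK', 'L_CELL']
```
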
[0430] The cell block type CBT_reg stores a value de- 
fining the type of the block indicated by the cell block 
mode CBM_reg. If the cell block is a multi-angle block,
A_BLOCK is stored; if not, N_BLOCK is stored.
[0431] The seamless reproduction flag SPF_reg 
stores a value defining whether that cell is seamlessly
connected with the cell or cell block reproduced
therebefore. If a seamless connection is specified, SML is
stored; if a seamless connection is not specified, NSML 
is stored. 

[0432] The interleaved allocation flag IAF_reg stores 
a value identifying whether the cell exists in a contiguous
or interleaved block. If the cell is part of an interleaved
block, ILVB is stored; otherwise N_ILVB is stored.
[0433] The STC resetting flag STCDF defines wheth- 
er the system time clock STC used for synchronization 
must be reset when the cell is reproduced; when reset- 
ting the system time clock STC is necessary, 
STC_RESET is stored; if resetting is not necessary, 
STC_NRESET is stored.

[0434] The seamless angle change flag SACF_reg 
stores a value indicating whether a cell in a multi-angle 
period should be connected seamlessly at an angle 
change. If the angle change is seamless, the seamless 
angle change flag SACF is set to SML; otherwise it is 
set to NSML.

[0435] The first cell VOBU start address 
C_FVOBU_SA_reg stores the VOBU start address of 
the first cell in a block. The value of this address is
expressed as the distance from the logical sector of the first
cell in the VTS title VOBS (VTSTT_VOBS), measured
and stored as a number of sectors.
[0436] The last cell VOBU start address 
C_LVOBU_SA_reg stores the VOBU start address of 
the last cell in the block. The value of this address is
also expressed as the distance from the logical sector of
the first cell in the VTS title VOBS (VTSTT_VOBS),
measured and stored as a number of sectors.
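
Because both addresses are stored as sector counts relative to the first cell of VTSTT_VOBS, a reader needs the absolute position of that first cell to locate the data. The sketch below converts the two register values to byte offsets, assuming the 2048-byte logical sector used on DVD media. The function name and the `vobs_first_cell_sector` parameter are hypothetical.

```python
SECTOR_BYTES = 2048  # logical sector size assumed for DVD media


def cell_byte_offsets(vobs_first_cell_sector: int,
                      c_fvobu_sa: int, c_lvobu_sa: int) -> tuple:
    """Byte offsets of the first and last VOBU of a cell.

    vobs_first_cell_sector is the absolute sector of the first cell in
    VTSTT_VOBS; the two register values are sector distances from it.
    """
    first_vobu = (vobs_first_cell_sector + c_fvobu_sa) * SECTOR_BYTES
    last_vobu = (vobs_first_cell_sector + c_lvobu_sa) * SECTOR_BYTES
    return first_vobu, last_vobu


print(cell_byte_offsets(1000, 0, 10))  # (2048000, 2068480)
```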

[0437] The decoding table shown in Fig. 59 is de- 
scribed below. As shown in Fig. 59, the decoding table 
comprises the following registers: information registers 
for non-seamless multi-angle control, information regis- 
ters for seamless multi-angle control, a VOBU informa- 
tion register, and information registers for seamless re- 
production. 

[0438] The information registers for non-seamless 
multi-angle control comprise sub-registers 
NSML_AGL_C1_DSTA_reg - NSML_AGL_C9_DSTA_ 
reg. 

[0439] NSML_AGL_C1_DSTA_reg - NSML_AGL_C9_DSTA_reg
record the NSML_AGL_C1_DSTA -
NSML_AGL_C9_DSTA values in the PCI packet shown
in Fig. 20. 

[0440] The information registers for seamless multi- 
angle control comprise sub-registers SML_AGL_C1_ 
DSTA_reg - SML_AGL_C9_DSTA_reg. 
[0441] SML_AGL_C1_DSTA_reg - SML_AGL_C9_DSTA_reg
record the SML_AGL_C1_DSTA - SML_AGL_C9_DSTA
values in the DSI packet shown in Fig.
20. 

[0442] The VOBU information register stores the end 
pack address VOBU_EA in the DSI packet shown in Fig. 
20. 

[0443] The information registers for seamless reproduction
comprise the following sub-registers: an interleaved
unit flag ILVU_flag_reg, Unit END flag
UNIT_END_flag_reg, Interleaved Unit End Address
ILVU_EA_reg, Next Interleaved Unit Start Address
NT_ILVU_SA_reg, the presentation start time of the first
video frame in the VOB (Initial Video Frame Presentation
Start Time) VOB_V_SPTM_reg, the presentation
end time of the last video frame in the VOB (Final Video
Frame Presentation Termination Time) VOB_V_EPTM_reg,
audio reproduction stopping time 1
VOB_A_STP_PTM1_reg, audio reproduction stopping
time 2 VOB_A_STP_PTM2_reg, audio reproduction
stopping period 1 VOB_A_GAP_LEN1_reg, and audio
reproduction stopping period 2 VOB_A_GAP_LEN2_reg.

[0444] The interleaved unit flag ILVU_flag_reg stores 
the value indicating whether the video object unit VOBU 
is in an interleaved block, and stores ILVU if it is, and 
N_ILVU if not.
[0445] The Unit END flag UNIT_END_flag_reg stores
the value indicating whether the video object unit VOBU
is the last VOBU in the interleaved unit ILVU. Because
the interleaved unit ILVU is the data unit for continuous
reading, the UNIT_END_flag_reg stores END if the VOBU
currently being read is the last VOBU in the interleaved
unit ILVU, and otherwise stores N_END.
[0446] The Interleaved Unit End Address 
ILVU_EA_reg stores the address of the last pack in the 
ILVU to which the VOBU belongs if the VOBU is in an
interleaved block. This address is expressed as the 
number of sectors from the navigation pack NV of that 
VOBU. 

[0447] The Next Interleaved Unit Start Address 
NT_ILVU_SA_reg stores the start address of the next
interleaved unit ILVU if the VOBU is in an interleaved 
block. This address is also expressed as the number of 
sectors from the navigation pack NV of that VOBU. 
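
Since ILVU_EA_reg and NT_ILVU_SA_reg are both relative to the navigation pack NV of the current VOBU, the absolute sectors follow by simple addition. The sketch below shows that arithmetic. The function name and the `nv_pack_sector` parameter are hypothetical.

```python
def ilvu_absolute_sectors(nv_pack_sector: int, ilvu_ea: int, nt_ilvu_sa: int) -> tuple:
    """Absolute sector of the current ILVU's last pack and of the next ILVU's start.

    nv_pack_sector is the absolute sector of the navigation pack NV of the
    VOBU being read; the other two values are the register contents.
    """
    current_ilvu_end = nv_pack_sector + ilvu_ea
    next_ilvu_start = nv_pack_sector + nt_ilvu_sa
    return current_ilvu_end, next_ilvu_start


print(ilvu_absolute_sectors(5000, 120, 400))  # (5120, 5400)
```
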
[0448] The Initial Video Frame Presentation Start 
Time register VOB_V_SPTM_reg stores the time at
which presentation of the first video frame in the VOB 
starts. 

[0449] The Final Video Frame Presentation Termina- 
tion Time register VOB_V_EPTM_reg stores the time at 
which presentation of the last video frame in the VOB
ends. 

[0450] The audio reproduction stopping time 1 
VOB_A_STP_PTM1_reg stores the time at which the 
audio is to be paused to enable resynchronization, and
the audio reproduction stopping period 1
VOB_A_GAP_LEN1_reg stores the length of this pause
period. 

[0451] The audio reproduction stopping time 2 
VOB_A_STP_PTM2_reg and audio reproduction stopping
period 2 VOB_A_GAP_LEN2_reg store the same
values.

[0452] The operation of the DVD decoder DCD ac- 
cording to the present invention as shown in Fig. 26 is 
described next below with reference to the flow chart in 
Fig. 60.
[0453] At step #310202 it is first determined whether 
a disk has been inserted. If it has, the procedure moves 
to step #310204. 
[0454] At step #310204, the volume file structure VFS
(Fig. 21) is read, and the procedure moves to step 
#310206. 

[0455] At step #310206, the video manager VMG 
(Fig. 21) is read and the video title set VTS to be repro- 
duced is extracted. The procedure then moves to step 
#310208. 

[0456] At step #310208, the video title set menu ad- 
dress information VTSM_C_ADT is extracted from the 
VTS information VTSI, and the procedure moves to step 
#310210. 

[0457] At step #310210 the video title set menu 
VTSM_VOBS is read from the disk based on the video
title set menu address information VTSM_C_ADT, and 
the title selection menu is presented. 
[0458] The user is thus able to select the desired title 
from this menu in step #310212. If the titles include both
contiguous titles with no user-selectable content, and 
titles containing audio numbers, sub-picture numbers, 
or multi-angle scene content, the user must also enter 
the desired angle number. Once the user selection is 
completed, the procedure moves to step #310214. 
[0459] At step #310214, the VTS_PGCI #i program 
chain (PGC) data block corresponding to the title 
number selected by the user is extracted from the VT- 
SPGC information table VTS_PGCIT, and the proce- 
dure moves to step #310216. 
[0460] Reproduction of the program chain PGC then 
begins at step #310216. When program chain PGC re- 
production is finished, the decoding process ends. If a 
separate title is thereafter to be reproduced as deter- 
mined by monitoring key entry to the scenario selector, 
the title menu is presented again (step #310210). 
[0461] Program chain reproduction in step #310216 
above is described in further detail below referring to 
Fig. 61. The program chain PGC reproduction routine 
consists of steps #31030, #31032, #31034, and #31035
as shown. 

[0462] At step #31030 the decoding system table 
shown in Fig. 58 is defined. The angle number 
ANGLE_NO_reg, VTS number VTS_NO_reg, PGC 
number VTS_PGCI_NO_reg, audio ID AUDIO_ID_reg, 
and sub-picture ID SP_ID_reg are set according to the 
selections made by the user using the scenario selector 
2100. 

[0463] Once the PGC to be reproduced is determined, 
the corresponding cell information (PGC information en- 
tries C_PBI #j) is extracted and the cell information reg- 
ister is defined. The sub-registers therein that are de- 
fined are the cell block mode CBM_reg, cell block type 
CBT_reg, seamless reproduction flag SPF_reg, interleaved
allocation flag IAF_reg, STC resetting flag
STCDF, seamless angle change flag SACF_reg, first 
cell VOBU start address C_FVOBU_SA_reg, and last 
cell VOBU start address C_LVOBU_SA_reg. 
[0464] Once the decoding system table is defined, the 
process transferring data to the stream buffer (step 
#31032) and the process decoding the data in the 
stream buffer (step #31034) are activated in parallel.
[0465] The process transferring data to the stream 
buffer (step #31032) is the process of transferring data 
from the recording medium M to the stream buffer 2400. 
This is, therefore, the processing of reading the required 
data from the recording medium M and inputting the da- 
ta to the stream buffer 2400 according to the user-se- 
lected title information and the playback control informa- 
tion (navigation packs NV) written in the stream. 
[0466] The routine shown as step #31034 is the process
for decoding the data stored to the stream buffer
2400 (Fig. 26), and outputting the decoded data to the
video data output terminal 3600 and audio data output
terminal 3700. This is thus the process for decoding and
reproducing the data stored to the stream buffer 2400.
[0467] Note that step #31032 and step #31034 are
executed in parallel.

[0468] The processing unit of step #31032 is the cell,
and as processing one cell is completed, it is determined
in step #31035 whether the complete program chain
PGC has been processed. If processing the complete
program chain PGC is not completed, the decoding sys-
tem table is defined for the next cell in step #31030. This
loop from step #31030 through step #31035 is repeated
until the entire program chain PGC is processed.
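
The parallel relationship between the transfer process (step #31032) and the decoding process (step #31034) can be sketched as a producer and a consumer sharing one bounded buffer. The following is a minimal illustration only; the function name `reproduce_pgc`, the callbacks `read_cell` and `decode_chunk`, and the queue depth are assumptions introduced for the sketch and do not appear in the embodiment.

```python
import queue
import threading


def reproduce_pgc(cells, read_cell, decode_chunk):
    """Run transfer (step #31032) and decode (step #31034) in parallel.

    cells        -- cell identifiers of one program chain, in playback order
    read_cell    -- hypothetical callback yielding the data chunks of one cell
    decode_chunk -- hypothetical callback decoding one chunk
    """
    stream_buffer = queue.Queue(maxsize=8)  # stands in for stream buffer 2400
    done = object()                         # sentinel: whole PGC processed
    decoded = []

    def transfer():
        for cell in cells:                  # step #31030: table for next cell
            for chunk in read_cell(cell):   # step #31032: one cell at a time
                stream_buffer.put(chunk)    # blocks while the buffer is full
        stream_buffer.put(done)             # step #31035 answered YES

    def decode():
        while True:
            chunk = stream_buffer.get()
            if chunk is done:
                break
            decoded.append(decode_chunk(chunk))  # step #31034

    workers = [threading.Thread(target=transfer),
               threading.Thread(target=decode)]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    return decoded
```

With one producer and one consumer the chunk order is preserved, which mirrors the cell-by-cell processing order described above.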
[0469] The stream buffer data transfer process of step 
#31032 is described in further detail below referring to 
Fig. 62. The stream buffer data transfer process (step 
#31032) comprises steps #31040, #31042, #31044, 
#31046, and #31048 shown in the figure. 
[0470] At step #31040 it is determined whether the
cell is a multi-angle cell. If not, the procedure moves to
step #31044.

[0471] At step #31044 the non-multi-angle cell decod-
ing process is executed.

[0472] However, if step #31040 returns YES because
the cell is a multi-angle cell, the procedure moves to step
#31042 where the seamless angle change flag SACF is
evaluated to determine whether seamless angle repro-
duction is specified.

[0473] If seamless angle reproduction is specified, the
seamless multi-angle decoding process is executed in
step #31046. If seamless angle reproduction is not
specified, the non-seamless multi-angle decoding proc-
ess is executed in step #31048.
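
The branching of Fig. 62 reduces to two tests. A minimal sketch follows; the function name and the returned labels are illustrative assumptions, and only the order of the tests is taken from the description above.

```python
def select_transfer_process(is_multi_angle: bool, sacf: bool) -> str:
    """Choose the stream buffer transfer routine for one cell (Fig. 62)."""
    if not is_multi_angle:                   # step #31040 returns NO
        return "non_multi_angle"             # step #31044
    if sacf:                                 # step #31042: SACF is set
        return "seamless_multi_angle"        # step #31046
    return "non_seamless_multi_angle"        # step #31048
```

Note that the seamless angle change flag is consulted only for multi-angle cells; for any other cell its value has no effect on the selection.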
[0474] The non-multi-angle cell decoding process 
(step #31044, Fig. 62) is described further below with 
reference to Fig. 63. Note that the non-multi-angle cell 
decoding process (step #31044) comprises the steps 
#31050, #31052, and #31054. 
[0475] The first step #31050 evaluates the interleaved
allocation flag IAF_reg to determine whether the cell is
in an interleaved block. If it is, the non-multi-angle inter- 
leaved block process is executed in step #31052. 
[0476] The non-multi-angle interleaved block process 
(step #31052) processes scene branching and connec- 
tion where seamless connections are specified in, for 
example, a multi-scene period. 



[0477] However, if the cell is not in an interleaved 
block, the non-multi-angle contiguous block process is 
executed in step #31054. Note that the step #31054 
process is the process executed when there is no scene 
branching or connection.

[0478] The non-multi-angle interleaved block process 
(step #31052, Fig. 63) is described further below with 
reference to Fig. 64. 

[0479] At step #31060 the reading head 2006 is 
jumped to the first cell VOBU start address
C_FVOBU_SA read from the C_FVOBU_SA_reg regis- 
ter. 

[0480] More specifically, the address data 
C_FVOBU_SA_reg stored in the decoding system con-
troller 2300 (Fig. 26) is input as bitstream reproduction
control signal St53 to the reproduction controller 2002.
The reproduction controller 2002 thus controls the re-
cording media drive unit 2004 and signal processor
2008 to move the reading head 2006 to the specified
address. The data is then read, error correction code ECC
and other signal processing is accomplished by the signal
processor 2008, and the cell start VOBU data is output
as the reproduced bitstream St61 to the stream buffer
2400. The procedure then moves to step #31062.

[0481] At step #31062 the DSI packet data in the nav-
igation pack NV (Fig. 20) is extracted in the stream buff-
er 2400, the decoding table is defined, and the proce-
dure moves to step #31064. The registers set in the de-
coding table are ILVU_EA_reg, NT_ILVU_SA_reg,
VOB_V_SPTM_reg, VOB_V_EPTM_reg,
VOB_A_STP_PTM1_reg, VOB_A_STP_PTM2_reg,
VOB_A_GAP_LEN1_reg, and VOB_A_GAP_LEN2_reg.
[0482] At step #31064 the data from the first cell VO-
BU start address C_FVOBU_SA_reg to the ILVU end
pack address ILVU_EA_reg, i.e., the data for one inter-
leaved unit ILVU, is transferred to the stream buffer
2400. The procedure then moves to step #31066. 
[0483] More specifically, the address data 
ILVU_EA_reg stored in the decoding system controller
2300 (Fig. 26) is supplied to the reproduction controller
2002. The reproduction controller 2002 thus controls the
recording media drive unit 2004 and signal processor
2008 to read the data to the ILVU_EA_reg address, and
after error correction code ECC and other signal
processing is accomplished by the signal processor
2008, the data for the first ILVU in the cell is output as
the reproduced bitstream St61 to the stream buffer
2400. It is thus possible to output the data for one con-
tiguous interleaved unit ILVU on the recording medium
M to the stream buffer 2400.

[0484] At step #31066 it is determined whether all in-
terleaved units in the interleaved block have been read
and transferred. If the interleaved unit ILVU processed
is the last ILVU in the interleaved block, "0x7FFFFFFF"
indicating termination is set to the next-ILVU start ad-
dress NT_ILVU_SA_reg as the next read address. If the
interleaved units in the interleaved block have not all been
processed, the procedure moves to step #31068.
[0485] At step #31068 the reading head 2006 is again
jumped to the address NT_ILVU_SA_reg of the next in-
terleave unit to be reproduced, and the procedure loops
back to step #31062. Note that this jump is also accom-
plished as described above, and the loop from step
#31062 to step #31068 is repeated.
[0486] However, if step #31066 returns YES, i.e., all 
interleaved units ILVU in the interleaved block have been
transferred, step #31052 terminates.
[0487] The non-multi-angle interleaved block process
(step #31052) thus transfers the data of one cell to the 
stream buffer 2400. 
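
The loop of steps #31060 through #31068 can be sketched as follows. This is a simplified illustration under stated assumptions: addresses are treated as absolute sector numbers, and `read_dsi` is a hypothetical callback standing in for extraction of ILVU_EA and NT_ILVU_SA from the DSI packet of the navigation pack at a given address. Only the terminator value 0x7FFFFFFF is taken from the description.

```python
END_OF_BLOCK = 0x7FFFFFFF  # terminator written to NT_ILVU_SA for the last ILVU


def transfer_interleaved_block(c_fvobu_sa, read_dsi):
    """Return the (start, end) address range of every ILVU transferred.

    c_fvobu_sa -- first cell VOBU start address (C_FVOBU_SA_reg)
    read_dsi   -- hypothetical callback: address -> (ilvu_ea, nt_ilvu_sa)
    """
    transferred = []
    address = c_fvobu_sa                         # step #31060: initial jump
    while True:
        ilvu_ea, nt_ilvu_sa = read_dsi(address)  # step #31062: set registers
        transferred.append((address, ilvu_ea))   # step #31064: one ILVU
        if nt_ilvu_sa == END_OF_BLOCK:           # step #31066: last ILVU?
            return transferred                   # step #31052 terminates
        address = nt_ilvu_sa                     # step #31068: jump to next
```

The units belonging to other reproduction paths, which lie between the returned ranges on the recording medium, are skipped by the jump and never reach the stream buffer.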

[0488] The non-multi-angle contiguous block process
executed in step #31054, Fig. 63, is described further
below with reference to Fig. 65.
[0489] At step #31070 the reading head 2006 is
jumped to the first cell VOBU start address
C_FVOBU_SA read from the C_FVOBU_SA_reg regis-
ter. This jump is also accomplished as described above,
and the loop from step #31072 to step #31076 is initiat-
ed.

[0490] At step #31072 the DSI packet data in the nav-
igation pack NV (Fig. 20) is extracted in the stream buff-
er 2400, the decoding table is defined, and the proce-
dure moves to step #31074. The registers set in the de-
coding table are VOBU_EA_reg, VOB_V_SPTM_reg,
VOB_V_EPTM_reg, VOB_A_STP_PTM1_reg,
VOB_A_STP_PTM2_reg, VOB_A_GAP_LEN1_reg,
and VOB_A_GAP_LEN2_reg.

[0491] At step #31074 the data from the first cell VO-
BU start address C_FVOBU_SA_reg to the end pack
address VOBU_EA_reg, i.e., the data for one video ob-
ject unit VOBU, is transferred to the stream buffer 2400.
The procedure then moves to step #31076. The data for
one video object unit VOBU contiguously arrayed to the
recording medium M can thus be transferred to the
stream buffer 2400.

[0492] At step #31076 it is determined whether all cell
data has been transferred. If not all VOBU in the cell have
been transferred, the data for the next VOBU is read
continuously, and the process loops back to step
#31070.

[0493] However, if all VOBU data in the cell has been
transferred as determined by the C_LVOBU_SA_reg
value in step #31076, the non-multi-angle contiguous
block process (step #31054) terminates. This process
thus transfers the data of one cell to the stream buffer
2400.
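
For comparison with the interleaved case, the contiguous block process can be sketched in the same style. The sketch assumes absolute addresses and assumes that each VOBU begins immediately after the end pack of the preceding one; `read_vobu_ea` is a hypothetical callback standing in for extraction of VOBU_EA from the navigation pack. Termination is decided by comparison with the last cell VOBU start address, as stated above.

```python
def transfer_contiguous_block(c_fvobu_sa, c_lvobu_sa, read_vobu_ea):
    """Return the (start, end) address range of every VOBU transferred.

    c_fvobu_sa   -- first cell VOBU start address (C_FVOBU_SA_reg)
    c_lvobu_sa   -- last cell VOBU start address (C_LVOBU_SA_reg)
    read_vobu_ea -- hypothetical callback: VOBU start -> end pack address
    """
    transferred = []
    address = c_fvobu_sa                        # step #31070: initial jump
    while True:
        vobu_ea = read_vobu_ea(address)         # step #31072: set registers
        transferred.append((address, vobu_ea))  # step #31074: one VOBU
        if address >= c_lvobu_sa:               # step #31076: last VOBU done
            return transferred                  # step #31054 terminates
        address = vobu_ea + 1                   # assumed: next VOBU follows
```

Unlike the interleaved case, no jump is needed between units, so the reading head simply continues to the next address.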

Decoding flows in the stream buffer

[0494] The process for decoding data in the stream 
buffer 2400 shown as step #31034 in Fig. 61 is de- 
scribed below referring to Fig. 66. This process (step 
#31034) comprises steps #31110, #31112, #31114, and
#31116. 

[0495] At step #31110 data is transferred in pack units 
from the stream buffer 2400 to the system decoder 2500
(Fig. 26). The procedure then moves to step #31112.
[0496] At step #31112 the pack data is transferred from the
stream buffer 2400 to each of the buffers, i.e., the video buffer
2600, sub-picture buffer 2700, and audio buffer 2800.
[0497] At step #31112 the IDs of the user-selected au-
dio and sub-picture data, i.e., the audio ID
AUDIO_ID_reg and the sub-picture ID SP_ID_reg 
stored to the scenario information register shown in Fig. 
58, are compared with the stream ID and sub-stream ID 
read from the packet header (Fig. 19), and the matching
packets are output to the respective buffers. The proce- 
dure then moves to step #31114. 
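
The ID comparison of step #31112 can be sketched as a lookup keyed on the pair (stream ID, sub-stream ID). The `Packet` type, the function name, and the representation of each register as such a pair are assumptions made for illustration; the numeric IDs used in any call are example values and are not taken from this description.

```python
from typing import NamedTuple, Optional


class Packet(NamedTuple):
    stream_id: int
    sub_stream_id: Optional[int]  # None when the packet carries no sub-stream ID
    payload: bytes


def route_packets(packets, video_id, audio_id_reg, sp_id_reg):
    """Distribute matching packets to the video, audio, and sub-picture buffers.

    Each *_id argument is a (stream_id, sub_stream_id) pair; audio_id_reg and
    sp_id_reg stand in for AUDIO_ID_reg and SP_ID_reg.
    """
    routes = {video_id: "video", audio_id_reg: "audio", sp_id_reg: "sub_picture"}
    buffers = {"video": [], "audio": [], "sub_picture": []}
    for packet in packets:
        target = routes.get((packet.stream_id, packet.sub_stream_id))
        if target is not None:        # packets of unselected streams are dropped
            buffers[target].append(packet.payload)
    return buffers
```

Packets belonging to audio or sub-picture streams that the user did not select match no entry and are therefore discarded rather than buffered.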
[0498] The decode timing of the respective decoders 
(video, sub-picture, and audio decoders) is controlled in 
step #31114, i.e., the decoding operations of the decod- 
ers are synchronized, and the procedure moves to step 
#31116. 

[0499] Note that the decoder synchronization process 
of step #31114 is described below with reference to Fig. 
15. 

[0500] The respective elementary strings are then de- 
coded at step #31116. The video decoder 3801 thus 
reads and decodes the data from the video buffer, the 
sub-picture decoder 3100 reads and decodes the data 
from the sub-picture buffer, and the audio decoder 3200 
reads and decodes the data from the audio buffer. 
[0501] This stream buffer data decoding process then 
terminates when these decoding processes are com- 
pleted. 

[0502] The decoder synchronization process of step 
#31114, Fig. 66, is described below with reference to 
Fig. 15. This process comprises steps #31120,
#31122, and #31124. 

[0503] At step #31120 it is determined whether a 
seamless connection is specified between the current 
cell and the preceding cell. If a seamless connection is
specified, the procedure moves to step #31122; if not, the proce-
dure moves to step #31124.

[0504] A process synchronizing operation for produc- 
ing seamless connections is executed in step #31122, 
and a process synchronizing operation for non-seam-
less connections is executed in step #31124.

System encoder 

[0505] In the embodiment described below, plural 
buffers, including a stream buffer 2400, video buffer 
2600, audio buffer 2800, and reordering buffer 3300 as 
shown in Fig. 26, are used for the single time-share con- 
trolled buffer of the DVD decoder DCD in the present 
invention. 

[0506] Note that in the following description the actual 
buffer means made from semiconductor memory devic- 
es or similar physical means are referred to as "physical 
buffers," and the buffer means to which different data 
are stored by time-share controlled use of the physical 
buffers are referred to as "functional buffers." Note that 
sub-picture data decoding is completed instantaneous-
ly, and the load imposed on DVD decoder DCD opera-
tion can thus be ignored in comparison with load im-
posed by the encoded audio and video streams. The
description of the present embodiment below is there-
fore limited to a single encoded video stream and a sin-
gle encoded audio stream for simplicity.
[0507] Shown in Fig. 39 are the simulated results of 
data input/output to the video buffer 2600 and audio 
buffer 2800 of the DVD decoder DCD, and the sequence 
in which the DVD encoder ECD multiplexes the encoded
video stream St27 and the encoded audio stream St31 
to generate the corresponding bitstream. Note that the 
progression of time is shown on the horizontal axis T. 
[0508] The frame G1 shown at the top row in Fig. 39 
shows the packetizing of the encoded video stream St27
by the DVD encoder ECD. Each block V in frame G1
indicates a video packet V. The vertical axis indicates
the input transfer rate to the video buffer 2600, and the
horizontal axis, time-base T, indicates the transfer time.
The area of each video packet represents the data size
of the packet. The audio packets A are similarly shown
with the area of the audio packet also indicating the
packet size. Note, however, that while the audio packets
appear to be larger than the video packets V, i.e., con-
tain more data, the audio packets and video packets are
all the same size. 

[0509] Data input/output to the video buffer 2600 of 
the DVD decoder DCD is shown on the second row of 
Fig. 39. The vertical axis Vdv here indicates the accu-
mulated video data volume Vdv in the video buffer 2600.
[0510] More specifically, the first video packet V in the
encoded video stream St71 input to the video buffer
2600 is input at time Tb1. The last video packet V in the
encoded video stream St71 is input at time Tvf. Line SVi
thus indicates the change in the video data volume Vdv
accumulated in the video buffer 2600 at the front of the
encoded video stream St71, and line SVf indicates the
change in the video data volume Vdv accumulated in
the video buffer 2600 at the end of the encoded video
stream St71. Thus, the slopes of lines SVi and SVf in-
dicate the input rate to the video buffer 2600. Line BCv
indicates the maximum accumulation capacity (storage 
capacity) of the video buffer 2600. 
[0511] Note that lines BCv and BCa are determined 
based on data written to the system stream header ac-
cording to the MPEG standard.
[0512] The accumulated video data volume Vdv in the 
video buffer 2600 increases linearly, and at time Td1 the 
first block d1 of video data is batch transferred in a first-
in first-out (FIFO) fashion to the video decoder 3801
whereby it is consumed for decoding. As a result, the
accumulated video data volume Vdv is reduced to (BCv
- d1), and then continues to accumulate. Note that while
this example shows the accumulated video data volume
Vdv at time Td1 to have reached the maximum storage
capacity BCv of the video buffer 2600, it is not necessary
for the accumulated video data volume Vdv to have
reached the maximum storage capacity BCv when de-
coding begins, and it may obviously be less than the max-
imum storage capacity BCv.

[0513] Part of the data d1 transferred to the video buff- 
er 2600, specifically the data at point B at the top end 
of the dotted line having the same slope as line SVi and 
intersecting the time-base at intersection Tb2, was data
input at time Tb2. Thus, the data block d1 first decoded
is the data input between times Tb1 and Tb2. Furthermore,
when data input time Tb2 is later than decoding time 
Td1, a data underflow state occurs in the video buffer
2600 at time Td1. 
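
The underflow condition stated above, namely that the input of data block d1 must be complete by decoding time Td1, can be checked with a short calculation. The sketch assumes a constant input rate (the slope of line SVi) and uses hypothetical parameter names; it is an illustration of the condition, not the verification procedure of the embodiment.

```python
def input_end_time_ms(tb1_ms, block_bytes, input_rate_bytes_per_ms):
    """Time Tb2 at which the last byte of the first decoded block is input."""
    return tb1_ms + block_bytes / input_rate_bytes_per_ms


def underflows(tb1_ms, block_bytes, input_rate_bytes_per_ms, td1_ms):
    """True when input time Tb2 falls later than decoding time Td1."""
    return input_end_time_ms(tb1_ms, block_bytes, input_rate_bytes_per_ms) > td1_ms
```

For a fixed input rate, a larger block d1 or an earlier decoding time Td1 moves the buffer toward the underflow state.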

[0514] The variation in the per-picture encoded data 
quantity is great in an MPEG-compressed encoded vid- 
eo stream, and temporary depletion of large amounts of 
encoded data may occur. To prevent a data underflow 
state from occurring in the video buffer in such cases, it 
is necessary to write as much data as possible to the 
video buffer 2600. The time required for data transfer is 
thus called the video buffer verifier delay vbv_delay. 
[0515] The third row in Fig. 39 shows the audio data 
packetizing process. As with the video data packets in 
the first row, the frames A indicate the audio packets A, 
the size of which is equal to the size of the video packets 
V. 

[0516] The fourth row simulates the results of data in- 
put/output to the audio buffer 2800 similar to the results 
of data input/output to the video buffer 2600 in the sec- 
ond row. The vertical axis here indicates the accumulat- 
ed audio data volume Vda in the audio stream buffer 
2800. 

[0517] Note that in Fig. 39 time Tvp1 is the video pres- 
entation start time, Tap1 is the audio presentation start 
time, Fv is the video frame reproduction time, and Fa is 
the audio frame reproduction time.
[0518] At time Tad1, the first audio packet A in the en-
coded audio stream St75 is input to the audio buffer
2800. Line SAi thus indicates the change in the audio 
data volume Vda accumulated in the audio buffer 2800 
at the front of the encoded audio stream St75, and line 
SAf indicates the change in the audio data volume Vda 
accumulated in the audio buffer 2800 at the end of the 
encoded audio stream St75. Thus, the slopes of lines 
SAi and SAf indicate the input rate to the audio buffer 
2800. Line BCa indicates the maximum accumulation 
capacity (storage capacity) of the audio buffer 2800. Note that the
maximum storage capacity BCa is obtained in the same 
manner as the maximum storage capacity BCv of the 
video buffer 2600. 

[0519] The audio access unit, i.e., the audio frame 
(which is also the audio compression unit), is generally 
constant in the audio stream. A data overflow state oc- 
curs in the audio buffer 2800 if the encoded audio 
stream St75 is input to the audio buffer 2800 in a short 
period at a rate exceeding the consumption rate, and 
the input volume thus exceeds the maximum storage 
capacity BCa of the audio buffer 2800. When this hap- 
pens, the next audio packet A cannot be input until audio 
data stored in the audio buffer 2800 is consumed, i.e.,
decoded.

[0520] Furthermore, because the video packets V and 
audio packets A are contiguous in a single bitstream, 
the following video packet V cannot be input to the video 
buffer 2600 even though the video buffer 2600 itself is
not in a data overflow state if a data overflow state oc- 
curs in the audio buffer 2800. Thus, a data overflow state 
in the audio buffer 2800 may create a data underflow 
state in the video buffer 2600 depending on the duration 
of the data overflow state. 

[0521] Therefore, to prevent an audio buffer overflow, 
data input to the audio buffer 2800 is restricted when the 
sum of the data accumulated in the audio buffer and the 
data size of the packet exceeds the maximum audio 
buffer capacity. More specifically, the present embodi- 
ment transfers only the packet(s) containing the (frame) 
data required by the audio decode time, and does not 
permit inputting more than the required amount of data 
to the audio buffer. However, because of the difference 
in the data size of the packets (approx. 2 KB) and the 
audio frame (1536 bytes at 384 Kbps with Dolby AC-3 
coding), the data for the frame following the current 
frame is simultaneously input. 
[0522] Thus, as shown by the audio data packet 
stream (row three, Fig. 39) and the audio buffer input/
output timing (row four, Fig. 39), only approximately one 
audio frame of data is input to the audio buffer 2800 be- 
fore the audio decode time. 
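
The input restriction of the audio buffer and the spill of one packet into the following frame can be illustrated as follows. The sketch assumes a packet payload of exactly 2048 bytes for the "approx. 2 KB" stated above; the function names are hypothetical.

```python
def may_input_audio_packet(accumulated_bytes, packet_bytes, capacity_bytes):
    """Input is permitted only while the packet still fits in the audio buffer."""
    return accumulated_bytes + packet_bytes <= capacity_bytes


def spill_into_next_frame(packet_bytes, frame_bytes):
    """Bytes of the following audio frame carried along with the current one."""
    return max(0, packet_bytes - frame_bytes)


PACKET_BYTES = 2048   # assumed value for "approx. 2 KB"
FRAME_BYTES = 1536    # one AC-3 frame at 384 Kbps, as stated above
next_frame_bytes = spill_into_next_frame(PACKET_BYTES, FRAME_BYTES)
```

Under these assumptions a single packet carries one complete frame plus the first 512 bytes of the next, which is why slightly more than one frame of data is present in the buffer at the audio decode time.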

[0523] Because of the characteristics of an MPEG- 
compressed video stream, decoding normally starts at 
video frame reproduction time Fv before the first video 
presentation start time Tvp1, and the audio data is input
to the audio buffer 2800 at audio frame reproduction
time Fa before decoding starts, i.e., before audio pres-
entation start time Tap1. The video stream is thus input
to the video buffer 2600 approximately (video buffer ver-
ifier delay vbv_delay + video frame reproduction time Fv
- audio frame reproduction time Fa) before audio stream
input begins. 

[0524] The fifth row in Fig. 39 shows the results of in- 
terleaving the video packet stream G1 (row 1) with the 
audio packet stream G2 (row 3). The audio and video 
packets are interleaved by multiplexing referenced to 
the respective input times to the audio and video buffers. 
[0525] For example, Tb1 is the index for the buffer in-
put time of the first pack in the encoded video stream, 
and Ta1 is the index for the buffer input time of the first 
pack in the encoded audio stream. The packed data is 
then multiplexed referenced to the buffer input time of 
the data in the packs to the audio and video buffers. Be-
cause the encoded video stream is thus input to the vid-
eo buffer 2600 approximately the vbv_delay plus one
video frame minus one audio frame ahead of the encoded
audio stream, plural video packets are contiguous at the
front of the system stream. There
is a similar series of audio packets at the end of the sys- 
tem stream equivalent to approximately the lead time at 
which the video stream is buffered before the encoded 
audio stream.
[0526] Note again that a data overflow state occurs in
the audio buffer 2800 if the encoded audio stream St75
is input to the audio buffer 2800 in a short period at a
rate exceeding the consumption rate, and the input vol-
ume thus exceeds the maximum storage capacity BCa
of the audio buffer 2800. When this happens, the next
audio packet A cannot be input until audio data stored
in the audio buffer 2800 is consumed, i.e., decoded.
Gaps therefore occur at the end of the system stream
when only the audio packets are being transferred.
[0527] For example, if the video bit rate is 8 Mbps, the 
video buffer capacity is 224 KB, and 224 KB of video 
data are buffered before video decoding starts in the 
DVD system, the video buffer verifier delay vbv_delay
will be approximately 219 msec. If NTSC video and AC-
3 audio coding are used, one NTSC video frame is ap-
proximately 33 msec, and one AC-3 audio frame is ap-
proximately 32 msec. At the head of the system stream
in this example the video stream leads the audio stream
by approximately 220 msec (= 219 msec + 33 msec -
32 msec), and video packets are arrayed contiguously
for this period. 
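
The figures in this example can be reproduced with the short calculation below. The unit convention is an assumption: the description does not state it, but reading 1 KB as 1024 bytes and 1 Mbps as 1024 x 1024 bits per second yields the stated value of approximately 219 msec.

```python
KB = 1024            # assumed: binary kilobyte
MBPS = 1024 * 1024   # assumed: binary megabit per second


def vbv_delay_ms(buffered_bytes, bit_rate_bps):
    """Time needed to input the buffered data at the given video bit rate."""
    return buffered_bytes * 8 / bit_rate_bps * 1000


vbv = vbv_delay_ms(224 * KB, 8 * MBPS)   # 218.75 msec, i.e. approx. 219 msec
video_frame_ms = 33                      # one NTSC frame, as stated above
audio_frame_ms = 32                      # one AC-3 frame, as stated above
video_lead_ms = round(vbv) + video_frame_ms - audio_frame_ms
```

Here `video_lead_ms` evaluates to 220, matching the lead of the video stream over the audio stream at the head of the system stream.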

[0528] The audio packets continue in a similar series 
at the end of the system stream for the lead time of the
encoded video stream to the encoded audio stream.
[0529] By thus producing and recording the system 
streams, audio and video reproduction can be accom- 
plished without creating a data underflow state in the 
video buffer of the DVD decoder shown in Fig. 26. 

[0530] Movies and other titles can be recorded to an
optical disk by a DVD system using this type of MPEG
system stream. However, if plural titles implementing
parental lock control, director's cut selections, and other
features are recorded to a single optical disk, it may be
necessary to record ten or more titles to the disk. This
may require the bit rate to be dropped with the attend-
ant loss of image quality.

[0531] However, by sharing the system streams com-
mon to plural titles, e.g., titles implementing parental
lock control, director's cut selections, and other fea-
tures, and discretely recording for each of the plural titles
only those scenes that are unique to those titles, it is
possible to record plural different titles to a single optical
disk without reducing the bit rate, and thereby without
loss of image quality. This method thus makes it possi-
ble, for example, to record plural titles for different coun-
tries, cultures, or language groups to a single optical 
disk without reducing the bit rate and therefore without 
loss of image quality. 

[0532] An example of a title stream providing for pa-
rental lock control is shown in Fig. 40. When so-called
"adult scenes" containing sex, violence, or other scenes
deemed unsuitable for children are contained in a title
implementing parental lock control, the title stream is re-
corded with a combination of common system streams
SSa, SSb, and SSe, an adult-oriented system stream 
SSc containing the adult scenes, and a child-oriented 
system stream SSd containing only the scenes suitable
for children. Title streams such as this are recorded as
a multi-scene system stream containing the adult-ori- 
ented system stream SSc and the child-oriented system 
stream SSd arrayed to the multi-scene period between 
common system streams SSb and SSe.
[0533] The relationship between each of the compo- 
nent titles and the system stream recorded to the pro- 
gram chain PGC of a title stream thus comprised is de- 
scribed below. 

[0534] The adult-oriented title program chain PGC1
comprises in sequence the common system streams
SSa and SSb, the adult-oriented system stream SSc,
and the common system stream SSe. The child-orient-
ed title program chain PGC2 comprises in sequence the
common system streams SSa and SSb, the child-orient-
ed system stream SSd, and the common system stream
SSe. 
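
The sharing of system streams between the two program chains can be represented with ordinary lists. The stream and chain names follow the description; the helper functions are hypothetical and serve only to show which streams need to be recorded once and which are common to both titles.

```python
PGC1 = ["SSa", "SSb", "SSc", "SSe"]   # adult-oriented title
PGC2 = ["SSa", "SSb", "SSd", "SSe"]   # child-oriented title


def streams_to_record(*chains):
    """Distinct system streams across all chains, in first-seen order."""
    recorded = []
    for chain in chains:
        for stream in chain:
            if stream not in recorded:
                recorded.append(stream)
    return recorded


def shared_streams(*chains):
    """System streams that appear in every chain."""
    first, rest = chains[0], chains[1:]
    return [stream for stream in first if all(stream in c for c in rest)]
```

In this example five system streams are recorded to serve two titles of four streams each, and SSa, SSb, and SSe are the shared portion.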

[0535] To share system streams within titles compris- 
ing multi-scene periods such as this, and to divide the 
system stream as needed for authoring, it is also nec- 
essary to be able to connect and contiguously repro- 
duce these system streams. When system streams are 
connected and contiguously reproduced, however, 
pauses in the video presentation (freezes) occur at the 
system stream connections, and seamless reproduction 
presenting a natural flow of a single title can be difficult 
to achieve. 

[0536] Data input/output to the video buffer 2600 of 
the DVD decoder DCD shown in Fig. 26 during contig- 
uous reproduction is shown in Fig. 41. In Fig. 41, block 
Ga shows the data input/output to the video buffer 2600 
when encoded video stream Sva and encoded video 
stream Svb are input to the DVD decoder DCD. Block 
Gb shows the video packet streams of encoded video 
stream Sva and encoded video stream Svb. Block Gc 
shows the interleaved system streams Sra and Srb. 
Note that blocks Ga, Gb, and Gc are arranged refer- 
enced to the same time-base T as that shown in Fig. 39. 
[0537] In block Ga the vertical axis shows the accu- 
mulated video data volume Vdv in the video buffer, and 
slope Sva indicates the input rate to the video buffer 
2600. Where the video data volume Vdv accumulated 
in the video buffer 2600 is shown to decrease in block 
Ga therefore indicates data consumption, i.e., that data 
has been output for decoding. 
[0538] Time T1 also indicates the input end time of 
the last video packet V1 in the system stream Sra (block 
Gc), time T3 indicates the input end time of the last audio 
packet A1 in system stream Sra, and time Td indicates
the first decode time of encoded video stream Svb 
(block Ga). 

[0539] Of the two encoded streams, the encoded video
stream Sva and the encoded audio stream Saa, consti- 
tuting system stream Sra, the encoded video stream 
Sva is input to the video buffer 2600 before the encoded 
audio stream Saa is input to the audio buffer 2800 as 
described above. A series of audio packets A therefore 
remains at the end of the system stream Sra. 



[0540] A data overflow state also occurs in the audio 
buffer 2800 if audio packets A exceeding the capacity 
of the audio buffer 2800 are input thereto. When this oc- 
curs, the next audio packet cannot be buffered until an 
equivalent amount of audio data is consumed, i.e., de- 
coded. 

[0541] The first video packet V2 in system stream Srb 
therefore cannot be input to the video buffer 2600 until 
input of the last audio packet A1 in the system stream 
Sra is completed. As a result, video stream input to the 
video buffer 2600 cannot be continued due to the inter-
ference from audio packet A1 during the period from T1,
the input end time of the last video packet V1 in system 
stream Sra, to T3, the input end time of the last audio 
packet A1 in system stream Sra. 
[0542] In the following example it is assumed that the 
video bit rate of the DVD system is 8 Mbps, the video 
buffer capacity is 224 KB, the audio buffer capacity is 4 
KB, the audio data is encoded with Dolby AC-3 com- 
pression, and the compression bit rate is 384 Kbps. In 
AC-3 audio compression, the reproduction time of one 
audio frame is 32 msec, corresponding to a data size of 
1536 bytes/frame, and two audio frames can therefore 
be stored in the audio buffer. 
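
The frame size and buffer depth in this example follow from simple arithmetic, sketched below. The sketch assumes that 384 Kbps means 384,000 bits per second and that 4 KB means 4096 bytes; these conventions are not stated in the description but reproduce its figures.

```python
def audio_frame_bytes(bit_rate_bps, frame_ms):
    """Data size of one audio frame at a constant bit rate."""
    return bit_rate_bps * frame_ms // 8000   # bits/s * ms / (8 * 1000)


def whole_frames_in_buffer(capacity_bytes, frame_bytes):
    """Number of complete audio frames the buffer can hold."""
    return capacity_bytes // frame_bytes


frame_bytes = audio_frame_bytes(384_000, 32)
frames = whole_frames_in_buffer(4 * 1024, frame_bytes)
```

Under these assumptions `frame_bytes` is 1536 and `frames` is 2, in agreement with the values given above.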

[0543] Because the number of audio frames that can 
be stored in the audio buffer is two, the earliest possible 
time T3, which is the input end time of the last audio 
packet A1 in system stream Sra, is at the (reproduction 
start time of the last audio frame in system stream Sra) 
- (reproduction time of two audio frames). The reproduc- 
tion start time of the last audio frame in system stream 
Sra is also approximately one audio frame earlier than 
the presentation start time of the first frame in the en- 
coded video stream Svb of system stream Srb. The 
presentation start time of encoded video stream Svb is 
at the video buffer verifier delay vbv_delay plus one vid- 
eo frame after the input end time T1 of the last video 
packet V1 in system stream Sra. 
[0544] Therefore, if 224 KB of video data is buffered 
by the time video decoding starts, the video buffer ver- 
ifier delay vbv_delay is approximately 219 msec. If NT- 
SC video and AC-3 audio coding are used, one NTSC 
video frame is approximately 33 msec, and one AC-3 
audio frame is approximately 32 msec. Thus, there is 
approximately 156 msec (= 219 msec + 33 msec - 32 
msec - 2 x 32 msec) from the input end time T1 of the 
last video packet V1 in system stream Sra to the input 
end time T3 of the last audio packet A1 in system stream 
Sra. The encoded video stream Svb cannot be input to 
the video buffer 2600 during this approximately 156 
msec period. 
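
The length of the period during which the encoded video stream Svb cannot be input follows directly from the values above. A minimal sketch with hypothetical names:

```python
def blocked_interval_ms(vbv_delay_ms, video_frame_ms, audio_frame_ms,
                        audio_frames_in_buffer):
    """Interval from T1 to T3 during which trailing audio packets block video."""
    return (vbv_delay_ms + video_frame_ms - audio_frame_ms
            - audio_frames_in_buffer * audio_frame_ms)


blocked = blocked_interval_ms(219, 33, 32, 2)
remaining = 219 - blocked   # input time left within one vbv_delay after T1
```

Here `blocked` is 156 msec. Since decoding of encoded video stream Svb starts approximately one vbv_delay (219 msec) after T1, only about 63 msec remain for video input under these assumptions, far less than the 219 msec needed to input 224 KB at the stated bit rate.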

[0545] Therefore, because not all of the decode data d1
has been input to the video buffer 2600 at time Td, a data
underflow state occurs in the video buffer 2600. In such
cases the video presentation is intermittent, video freezing
occurs, and correct video presentation is interrupted.
[0546] Thus, when plural system streams are con- 
nected and contiguously decoded to reproduce a single
contiguous sequence of scenes from plural titles com-
prising a commonly shared system stream and plural
system streams containing content discretely encoded 
for specific titles, video freezing apparent as pauses in 
the video presentation at system stream connections 
can occur, and it is not always possible to seamlessly 
reproduce such plural system streams as a single con- 
tiguous title. 

[0547] When plural different system streams SSc and 
SSd are connected to one common system stream SSe 
as shown in Fig. 40, a time difference occurs between 
the video reproduction time and the audio reproduction 
time because of the offset between the audio and video 
frame reproduction times, and this time difference varies 
according to the reproduction path. As a result, buffer 
control fails at the connection, video reproduction freez- 
es or the audio reproduction is muted, and seamless re- 
production is not possible. 

[0548] This problem is considered below with refer- 
ence to Fig. 42 as it applies to the parental lock control 
example shown in Fig. 40. In Fig. 42 SScv and SSca 
represent the reproduction times of the video and audio 
frame unit streams in adult-oriented system stream SSc. 
SSdv and SSda similarly represent the reproduction 
times of the video and audio frame unit streams in the 
child-oriented system stream SSd. 
[0549] As described above, if NTSC video and AC-3 
audio coding are used, one NTSC video frame is ap- 
proximately 33 msec, and one AC-3 audio frame is ap- 
proximately 32 msec, and the audio and video repro- 
duction times therefore do not match. As a result, a dif- 
ference occurs in the video reproduction time, which is 
an integer multiple of the video frame reproduction time, 
and the audio reproduction time, which is an integer mul-
tiple of the audio frame reproduction time. This repro-
duction time difference is expressed as time Tc in the adult-
oriented system stream SSc, and time Td in the child-
oriented system stream SSd. This difference also varies
according to the change in the reproduction time of the
reproduction paths, and Tc ≠ Td.
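
The dependence of this difference on the length of the reproduction path can be illustrated numerically. The sketch uses the approximate frame times of 33 msec and 32 msec given above and assumes that the audio stream consists of the smallest whole number of audio frames that covers the video; that rounding rule is an assumption introduced for illustration.

```python
def av_time_difference_ms(video_frames, video_frame_ms=33, audio_frame_ms=32):
    """Audio reproduction time minus video reproduction time for one scene."""
    video_ms = video_frames * video_frame_ms
    audio_frames = -(-video_ms // audio_frame_ms)   # ceiling division
    return audio_frames * audio_frame_ms - video_ms


tc = av_time_difference_ms(3)   # e.g. a scene of 3 video frames
td = av_time_difference_ms(5)   # e.g. a scene of 5 video frames
```

Under these assumptions a scene of 3 video frames (99 msec) is covered by 4 audio frames (128 msec), giving a difference of 29 msec, while a scene of 5 video frames gives 27 msec. The difference is always less than one audio frame, but it is not the same for paths of different length.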
[0550] Therefore, when plural system streams are 
connected with a single system stream as described 
above with parental lock control and director's cut titles, 
there is a maximum reproduction gap of one frame in 
the audio and video reproduction times at the points 
where the system streams branch and connect. 
[0551] This reproduction gap is described next with 
reference to Fig. 43. The top program chain PGC1 rep- 
resents the reproduction path of the adult-oriented sys- 
tem stream. SScv and SSev represent the reproduction 
times of the video frame unit streams in adult-oriented 
system stream SSc and common system stream SSe, 
and SSca and SSea represent the reproduction times 
of the audio frame unit streams in adult-oriented system 
stream SSc and common system stream SSe. 
[0552] These frame unit reproduction times are ex- 
pressed in the figure by the line segments ended with 
arrows on both ends. 
[0553] The video stream SScv of the adult-oriented
system stream SSc in this example ends after 3 frames,
and is followed by the common system stream SSe
starting at frame 4 with the first frame of the video stream
SSev. The audio stream SSca likewise ends at frame 4,
and the first frame of the common audio stream SSea
starts from frame 5. The resulting difference in the frame
reproduction times between the audio and video
streams produces a reproduction gap of time Tc, equivalent
to a maximum of one frame, between the video
stream and the audio stream when these two streams
SSc and SSe are connected.

[0554] The bottom program chain PGC2 similarly
represents the reproduction path of the child-oriented
system stream. SSdv and SSev represent the reproduction
times of the video frame unit streams in child-oriented
system stream SSd and common system stream SSe,
and SSda and SSea represent the reproduction times
of the audio frame unit streams in child-oriented system
stream SSd and common system stream SSe.

[0555] As with the adult-oriented program chain
PGC1 above, a reproduction gap of time Td, equivalent
to a maximum of one frame, occurs between the video
stream and the audio stream when these two streams SSd
and SSe are connected. When the reproduction paths
to the common system streams differ before the
connection point as shown in Fig. 43, it is possible to adjust
the reproduction start times of the connected common
audio and video streams to the reproduction start time
difference of at least one reproduction path. As shown
in this figure, the audio and video end times of the
adult-oriented system stream SSc are the same as the audio
and video start times of the common system stream
SSe, i.e., a gap-less connection is achieved. Note that
in this example the gap Td of the child-oriented system
stream SSd is less than the gap Tc of the adult-oriented
system stream SSc (Td < Tc).

[0556] The one program chain PGC1, i.e., adult-oriented
system stream SSc and common system stream
SSe, is thus reproduced without a reproduction gap, but
program chain PGC2, i.e., child-oriented system stream
SSd and common system stream SSe, is reproduced
with an audio reproduction gap of Tc - Td. Thus, even
when connecting from plural reproduction paths (SSc
and SSd) to one system stream (SSe), it is possible to
eliminate any reproduction gap in the video or audio on
at least one reproduction path.
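
The relationship between the two path differences and the resulting
gap can be sketched as follows. The function name `connection_gaps`,
the dictionary layout, and the sample values of 27 and 4 msec are
assumptions chosen for illustration; only the rule that the path not
used for alignment sees a gap of Tc - Td comes from the description
above.

```python
def connection_gaps(path_diffs: dict, aligned_to: str) -> dict:
    """Gap seen on each path when the common stream is aligned to one path.

    path_diffs maps a path name to its audio/video time difference (msec).
    """
    reference = path_diffs[aligned_to]
    return {name: reference - diff for name, diff in path_diffs.items()}

# Hypothetical values: Tc = 27 msec on PGC1, Td = 4 msec on PGC2.
gaps = connection_gaps({"PGC1": 27.0, "PGC2": 4.0}, aligned_to="PGC1")
```

With the common stream aligned to PGC1, that path connects without a
gap and PGC2 is left with the remaining difference.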
[0557] The third row in Fig. 43 shows the change in
audio buffer storage during continuous reproduction of
program chain PGC2, i.e., child-oriented system stream
SSd and common system stream SSe. The per-frame
reproduction time of the audio stream in the audio buffer
is indicated by the arrows. Note that system streams
SSd and SSe are connected with an audio reproduction
gap of Tc - Td, the difference between the reproduction
time difference Tc of PGC1 and the reproduction time
difference Td of PGC2, at the connection.
[0558] However, because DVD players normally
synchronize audio and video output referenced to the audio
signal, the audio frames are output continuously. As a
result, the audio reproduction gap Tc - Td is not
reproduced as a gap during reproduction, and audio
reproduction is therefore contiguous.
[0559] The common system stream SSe is encoded
so that the audio is reproduced, i.e., decoded, at a delay
of Tc to the video. As a result, when the audio is
reproduced, i.e., decoded, so that there is no audio
reproduction gap Tc - Td, audio decoding is accomplished before
the audio data is completely input to the audio buffer,
and a data underflow state results in the audio buffer as
shown by line Lu in Fig. 43.

[0560] When the audio reproduction is contiguous
and a reproduction gap is inserted between video
frames, a data underflow state occurs in the video buffer
due to video stream reproduction, similarly to video
stream reproduction being interrupted as shown in Fig.
41.

[0561] As thus described, when plural different system
streams and one common system stream are connected,
a difference occurs between the video reproduction
time and the audio reproduction time of the
respective paths due to the offset in the audio and video
frame reproduction times.
[0562] The present invention therefore provides a
recording method and apparatus and a reproduction
method and apparatus whereby a video or audio buffer
underflow state is prevented at the system stream
connections, and seamless reproduction is achieved in
which pauses in the video reproduction (freezes) or
pauses in the audio reproduction (muting) do not occur.
[0563] A method of connecting a single common system
stream to the plural system streams contained in
the multi-scene period of a title stream as shown in Fig.
40 is described below according to the present invention.
The physical structure of the optical disk M, the
overall data structure of the optical disk M, and the
structures of the DVD encoder ECD and DVD decoder DCD
in this embodiment are as previously described with
reference to Figs. 4 - 14, Figs. 1, 16 - 20, 22, Figs. 25, 27 -
29, and Fig. 26 above, and further description thereof
is thus omitted below.

[0564] There are two data transfer models under the
MPEG standard: constant bit rate (CBR) whereby data
is transferred continuously without interruptions, and
variable bit rate (VBR) whereby data is transferred
intermittently with interruptions in the transfer. For
simplicity, the present embodiment is described below using
the CBR model only.
[0565] Referring first to Figs. 44, 45, and 46, a simple
one-to-one system stream connection between first and
second common system streams SSa and SSb is
described. For simplicity the following description is
restricted to operation using one video stream SSav and
one audio stream SSba.

[0566] The system streams produced according to
the present invention are shown in Fig. 44, the operation
whereby these system streams are connected is shown
in Fig. 45, and the method of generating the system
streams is shown in Fig. 46.

[0567] The structure of the tail of the leading common 
system stream SSa, and the head of the following com- 
mon system stream SSb, recorded to the optical disk M 
are shown in Fig. 44. 

[0568] In Fig. 44 are shown the structure of the end 
of the preceding common system stream SSa and the 
common system stream SSb following thereafter. Note 
that both system streams SSa and SSb are recorded to 
the optical disk M. 

[0569] The fifth row block Ge shows the structure of 
both system streams SSa and SSb. The first common 
system stream SSa comprises video stream SSav and 
audio stream SSaa; the second common system stream 
SSb similarly comprises video stream SSbv and audio 
stream SSba. 

[0570] The fourth row Gd shows the audio packet 
streams A of audio stream SSaa and audio stream SSba 
extracted from system stream SSa and system stream 
SSb. 

[0571] The third row Gc shows the data input/output 
state of the audio buffer 2800 when audio stream SSaa 
and audio stream SSba are input to the DVD decoder 
DCD shown in Fig. 26. 

[0572] The second row Gb shows the video packet 
streams V of video stream SSav and video stream SSbv 
extracted from system stream SSa and system stream 
SSb. 

[0573] The first row Ga shows the data input/output 
state of the video buffer 2600 when video stream SSav 
and video stream SSbv are input to the DVD decoder 
DCD shown in Fig. 26. 

[0574] Note that Ga, Gb, Gc, Gd, and Ge are all ref- 
erenced to the same time-base (direction T). 
[0575] Tvae in Fig. 44 is the input end time of the video 
stream SSav to the video buffer 2600, and Taae is the 
input end time of the audio stream SSaa to the audio 
buffer 2800. 

[0576] When system stream SSa is input to the DVD 
decoder DCD, the difference between the input end 
times Tvae and Taae of the video stream SSav and au- 
dio stream SSaa to the respective buffers 2600 and 
2800 is small, and is less than the reproduction time of 
two audio frames. As a result, the last audio packet A 
can be accumulated in the audio buffer 2800 before in- 
put of the audio and video streams in the next system 
stream starts. 

[0577] Likewise, when system stream SSb is input to 
the DVD decoder DCD, the difference between the input 
start times of the video stream SSbv and audio stream 
SSba to the respective buffers 2600 and 2800 is small, 
and is less than the reproduction time of two audio 
frames. 

[0578] The data input/output state to the video buffer
2600 when system streams SSa and SSb (Fig. 44)
stored to the optical disk M are connected and
contiguously reproduced is described next.

[0579] The top row in Fig. 45 shows the data input/output
state of the video buffer 2600 when video stream
SSav and video stream SSbv are input continuously to
the DVD decoder DCD.
[0580] As in Fig. 39, Fig. 41, and Fig. 44, the vertical
axis Vdv indicates the accumulated video data volume
Vdv in the video buffer 2600, and the horizontal axis
indicates time T. Lines SSav and SSbv indicate the
change in the video data volume Vdv accumulated in
the video buffer 2600, and the slopes of the lines
indicate the input rate to the video buffer 2600. Where the
accumulated video data volume Vdv in the video buffer
2600 drops indicates data consumption, i.e., that
decoding has occurred.
[0581] The second row shows the video packet 
streams in the video streams SSav and SSbv shown in 
Fig. 26. 

[0582] The third row shows the system streams SSa
and SSb according to the present embodiment. Time T1
is the input end time of the last video packet V1 in system
stream SSa, time T2 indicates the input start time of the
first video packet V2 in system stream SSb, and time Td
indicates the decoding start time of system stream SSb.
[0583] The difference between the input end times to
the video buffer 2600 and audio buffer 2800 of the video
stream SSav and the audio stream SSaa forming the
system stream SSa of the present embodiment is
reduced by the system stream production method shown
in Fig. 46. As a result, interference with the input of
system stream SSb resulting from a succession of
remaining audio packets A at the end of system stream SSa
does not occur. The difference between the input end
time T1 of the last video packet V1 of system stream
SSa and the input start time T2 of the first video packet
V2 in system stream SSb is small, and there is sufficient
time from the input start time T2 of video packet V2 to the
first decode time Td of the video stream SSbv. The
video buffer 2600 therefore does not underflow at time
Td.
[0584] Unlike with the system stream shown in Fig.
41, the audio buffer 2800 therefore does not overflow at
the end of the system stream, i.e., there is no
interference with inputting the encoded video stream of the next
system stream, when connecting and contiguously
reproducing system streams SSa and SSb according to
the present embodiment, and seamless reproduction
can be achieved.

[0585] A first method of producing a first common system
stream SSa and a second common system stream
SSb connected thereafter is described below with
reference to Fig. 46. Note that as in Fig. 44, the structure of
the tail of the leading common system stream SSa, and
the head of the following common system stream SSb,
recorded to the optical disk M are shown in Fig. 46.
[0586] The first row in Fig. 46 corresponds to block
Ga in Fig. 44, and simulates the data input/output of
video stream SSav and video stream SSbv to the video
buffer 2600. Time T1 is the input end time of all data in
the video stream SSav.

[0587] The second row similarly corresponds to block 
Gb in Fig. 44, and shows the video data packet stream. 
[0588] The third row similarly corresponds to block Gc 
in Fig. 44, and simulates the data input/output of audio 
stream SSaa and audio stream SSba to the audio buffer 
2800. 

[0589] The fourth row similarly corresponds to block 
Gd in Fig. 44, and shows the audio data packet stream. 
[0590] The fifth row similarly corresponds to block Ge 
in Fig. 44, and shows the system stream resulting from 
interleaving and packing the video packets V shown in 
the second row and the audio packets A shown in the 
fourth row. The video packets and audio packets are in- 
terleaved in a FIFO manner from the video and audio 
buffers referenced to the packet input time to the respec- 
tive buffer. In other words, the packed data is multi- 
plexed referenced to the time the data contained in the 
pack is input to the video or audio buffer. 
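
The multiplexing rule in paragraph [0590] can be sketched as a merge
of two packet queues ordered by buffer input time. The `Packet` type,
its field names, and the sample times are hypothetical and serve
only to illustrate the ordering rule; the actual packing format is
not modelled.

```python
import heapq
from typing import NamedTuple

class Packet(NamedTuple):
    input_time: float  # time the packet is input to its decoder buffer
    kind: str          # "V" for video, "A" for audio
    index: int         # position within its own elementary stream

def interleave(video, audio):
    """Merge two time-ordered packet queues by buffer input time."""
    return list(heapq.merge(video, audio, key=lambda p: p.input_time))

# Hypothetical input times for three video packets and one audio packet.
video = [Packet(0.0, "V", 0), Packet(1.0, "V", 1), Packet(2.0, "V", 2)]
audio = [Packet(1.5, "A", 0)]
multiplexed = interleave(video, audio)
```

Each queue is consumed first-in first-out, so the order of packets
within the video stream and within the audio stream is preserved.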
[0591] The method of generating the first common 
system stream and the following second common sys- 
tem stream is described next. 

[0592] It is assumed below that the video bit rate is 8 
Mbps, the video buffer capacity is 224 KB, the audio 
buffer capacity is 4 KB, the audio data is encoded with 
Dolby AC-3 compression, and the compression bit rate 
is 384 Kbps. In AC-3 audio compression, the reproduc- 
tion time of one audio frame is 32 msec, corresponding 
to a data size of 1536 bytes/frame, and two audio frames
can therefore be stored in the audio buffer. 
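
The frame size and buffer capacity stated in paragraph [0592] follow
from simple arithmetic, checked in the sketch below. The constant
names are illustrative, and 4 KB is assumed here to mean 4096 bytes.

```python
AUDIO_BITRATE_BPS = 384_000         # AC-3 compression bit rate
FRAME_TIME_MS = 32                  # reproduction time of one audio frame
AUDIO_BUFFER_BYTES = 4 * 1024       # audio buffer capacity (assumed 4096)

# Bits per frame = bit rate x frame time; divide by 8 for bytes.
frame_bytes = AUDIO_BITRATE_BPS * FRAME_TIME_MS // 1000 // 8

# Only whole frames count toward what the buffer can hold.
frames_in_buffer = AUDIO_BUFFER_BYTES // frame_bytes
```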
[0593] Referenced to the input end time T1 of the vid- 
eo stream SSav to the video buffer 2600, the audio 
frame data following the current audio frame is moved 
to the audio stream SSba at time T1 to accumulate one 
audio frame in the audio buffer. This operation is de- 
scribed in detail below referring to the simulation results 
shown in row 3 of Fig. 46. 

[0594] Specifically, two audio frames (1536 bytes each)
from the encoded audio stream SSaa are accumulated 
in the audio buffer (4 KB capacity) at time T1, and the 
third to sixth audio frames following thereafter, indicated 
by frame Ma in Fig. 46, are moved to the beginning of 
the following encoded audio stream SSba. Note that the 
encoded audio stream is moved in audio frame units be- 
cause the audio frame is the unit of reproduction. 
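
The frame movement of the first method can be sketched as a split of
the preceding audio stream at a frame boundary. The function name,
the list representation of the streams, and the parameter
`keep_until` are hypothetical; how many frames remain in SSaa is
decided by the buffer simulation at time T1, which is not modelled
here.

```python
def move_tail_frames(ssaa, ssba, keep_until):
    """Move the frames of ssaa from index keep_until on to the head of ssba.

    Frames are moved whole because the audio frame is the unit of
    reproduction.
    """
    kept = ssaa[:keep_until]
    moved = ssaa[keep_until:]
    return kept, moved + ssba

# Hypothetical frame labels: keep two frames, move the remaining four.
ssaa = ["a1", "a2", "a3", "a4", "a5", "a6"]
ssba = ["b1", "b2"]
new_ssaa, new_ssba = move_tail_frames(ssaa, ssba, keep_until=2)
```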
[0595] Following the above operation, the encoded 
video stream SSav is packetized as shown in row 2 in 
Fig. 46, and the encoded audio stream SSaa is pack- 
etized as shown in row 4. As shown in row 5, the video 
packets V and audio packets A are then interleaved 
(multiplexed) to maintain an average distribution of au- 
dio packets to video packets in the FIFO sequence de- 
scribed above referenced to the packet input times to 
the buffers 2600 and 2800. After packing and converting 
to a system stream, the data is then recorded to the op- 
tical disk. 

[0596] In the same manner the encoded video stream 
SSbv is packetized as shown in row 2 in Fig. 46, and
the encoded audio stream SSba is packetized as shown 
in row 4. As shown in row 5, the video packets V and 
audio packets A are then interleaved (multiplexed) to 
maintain an average distribution of audio packets to vid- 
eo packets in the FIFO sequence described above ref- 
erenced to the packet input times to the buffers 2600 
and 2800. After packing and converting to a system 
stream, the data is then recorded to the optical disk. 
[0597] The resulting system streams SSa and SSb 
are thus structured as shown in Fig. 44, enabling seam- 
less reproduction by the DVD decoder DCD shown in 
Fig. 26. 

[0598] Because two audio frames can be accumulat- 
ed in the audio buffer, the last audio frame in the system 
stream SSa stored in the audio buffer at time T1 is trans- 
ferred as the last audio packet in system stream SSa 
during the two-frame reproduction time before decoding 
said last audio frame begins. The maximum input end 
time difference between the video packets and audio 
packets at the end of the system stream SSa is therefore 
the reproduction time of two audio frames. 
[0599] Furthermore, the audio buffer will not under- 
flow if the next audio data is input to the audio buffer 
before the presentation end time of the audio frames ac- 
cumulated in the audio buffer as of time T2. The input 
time of the first audio packet in system stream SSb is 
therefore at latest within the reproduction time of two 
audio frames after time T2 (= the presentation time of 
the accumulated audio frames + the reproduction time 
of one audio frame). Therefore the maximum input start 
time difference between the video packets and audio 
packets at the beginning of system stream SSb is the 
reproduction time of two audio frames. 
[0600] A second method of producing the system 
stream recorded to an optical disk according to the 
present embodiment is described next below with refer- 
ence to Fig. 47. The first, second, third, fourth, and fifth 
rows in Fig. 47 simulate the video and audio data input/ 
output states to the respective buffers referenced to the 
same time-base T as shown in Fig. 44. 
[0601] The first row in Fig. 47 corresponds to block 
Ga in Fig. 44, and simulates the data input/output of vid- 
eo stream SSav and video stream SSbv to the video 
buffer 2600. 

[0602] The second row similarly corresponds to block 
Gb in Fig. 44, and shows the video data packet stream. 
[0603] The third row similarly corresponds to block Gc 
in Fig. 44, and simulates the data input/output of audio 
stream SSaa and audio stream SSba to the audio buffer 
2800. 

[0604] The fourth row similarly corresponds to block 
Gd in Fig. 44, and shows the audio data packet stream. 
[0605] The fifth row similarly corresponds to block Ge 
in Fig. 44, and shows the system stream resulting from 
interleaving and packing the video packets V shown in 
the second row and the audio packets A shown in the 
fourth row. The video packets and audio packets are
interleaved in a FIFO manner from the video and audio
buffers referenced to the packet input time to the
respective buffer. In other words, the packed data is
multiplexed referenced to the time the data contained in the
pack is input to the video or audio buffer. The first
common system stream SSa and the second common system
stream SSb following thereafter can be produced
using the first method described above with reference
to Fig. 46.

[0606] A different method for generating the first
common system stream SSa and the second common system
stream SSb following thereafter, i.e., a method
different from that described with reference to Fig. 46, is
described below with reference to Fig. 47.

[0607] In the first method described above, part of the
encoded audio stream from the preceding system
stream is moved to the following system stream. This
second method, however, is characterized by moving
part of the encoded video and audio streams from the
following system stream. This second method is
particularly effective when the preceding scene (system
stream) is a scene from a multi-scene period, i.e., when
moving data from one of plural scenes (system streams)
to the encoded system stream of a single scene is
extremely difficult.

[0608] With this method the first GOP in video stream
SSbv is moved to video stream SSav. The one GOP
moved from video stream SSbv is connected to video
stream SSav to assure time-base contiguity at the end
of video stream SSav. At the second GOP from the
beginning of video stream SSbv, i.e., the second GOP
counted from the beginning of video stream SSbv
including the first GOP already moved, referenced to the
input start time T2 of the data decoded first, the audio
data of one audio frame is moved to the audio stream
SSaa to accumulate one audio frame in the audio buffer.
[0609] The one audio frame of data moved from audio
stream SSba is then connected to audio stream SSaa
to assure time-base contiguity at the end of audio
stream SSaa.

[0610] The video data is moved in GOP units because
the GOP is the unit of video data reproduction. Audio
data is likewise moved in audio frame units because the
audio frame is the unit of audio reproduction.
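
The second method moves whole units in the opposite direction, from
the head of the following stream to the tail of the preceding one.
A minimal sketch, assuming the streams are held as lists of GOPs or
audio frames; the function name, the labels, and the number of audio
frames moved are hypothetical, since that number results from the
buffer simulation referenced to time T2.

```python
def move_head_units(preceding, following, count):
    """Append the first count units of following to the end of preceding.

    A unit is one GOP for video or one audio frame for audio, so the
    time base stays contiguous at the end of the preceding stream.
    """
    return preceding + following[:count], following[count:]

# Video: the first GOP of SSbv is moved to the end of SSav.
ssav, ssbv = move_head_units(["gA1", "gA2"], ["gB1", "gB2", "gB3"], 1)

# Audio: frames of SSba are moved to SSaa (a count of 3 is an assumption).
ssaa, ssba = move_head_units(["a1"], ["b1", "b2", "b3", "b4"], 3)
```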

[0611] Following the above operation, the encoded
video stream SSav is packetized as shown in row 2 in
Fig. 47, and the encoded audio stream SSaa is
packetized as shown in row 4. As shown in row 5, the video
packets V and audio packets A are then interleaved
(multiplexed) to maintain an average distribution of
audio packets to video packets in the FIFO sequence
described above referenced to the packet input times to
the buffers 2600 and 2800. After packing and converting
to a system stream, the data is then recorded to the
optical disk.

[0612] In the same manner the encoded video stream
SSbv is packetized as shown in row 2 in Fig. 47, and
the encoded audio stream SSba is packetized as shown
in row 4. As shown in row 5, the video packets V and
audio packets A are then interleaved (multiplexed) to
maintain an average distribution of audio packets to
video packets in the FIFO sequence described above
referenced to the packet input times to the buffers 2600
and 2800. After packing and converting to a system
stream, the data is then recorded to the optical disk.
[0613] The resulting system streams SSa and SSb
are thus structured as shown in Fig. 39, enabling
seamless reproduction by the DVD decoder DCD shown in
Fig. 26.

[0614] Because two audio frames can be accumulated
in the audio buffer, the last audio frame in the system
stream SSa stored in the audio buffer at time T1 is
transferred as the last audio packet in system stream SSa
during the two-frame reproduction time before decoding
said last audio frame begins. The maximum input end
time difference between the video packets and audio
packets at the end of the system stream SSa is therefore
the reproduction time of two audio frames.
[0615] Furthermore, the audio buffer will not underflow
if the next audio data is input to the audio buffer
before the presentation end time of the audio frames
accumulated in the audio buffer as of time T2. The input
time of the first audio packet in system stream SSb is
therefore at latest within the reproduction time of two
audio frames after time T2 (= the presentation time of
the accumulated audio frames + the reproduction time
of one audio frame). Therefore the maximum input start
time difference between the video packets and audio
packets at the beginning of system stream SSb is the
reproduction time of two audio frames.
[0616] The next embodiment relates to connecting
the system stream branches obtained by means of the
system encoder according to the preferred embodiment
of the present invention.

[0617] The physical structure of the optical disk, the
overall data structure of the optical disk, and the DVD
decoder DCD in the present embodiment are as
described above, and further description thereof is thus
omitted below.

[0618] Note that the description of the present
embodiment below is limited to a single encoded video stream
and a single encoded audio stream for simplicity.
[0619] Fig. 48 shows the structure of the end of the
second common system stream SSb, and the
beginnings of the two parental lock control system streams
SSc and SSd that can be connected to the end of
common system stream SSb. Note that the common system
stream SSb and one of the two parental lock control
system streams SSc and SSd are arrayed to the same
time-base (horizontal time axis T) as shown in Fig. 46.
[0620] System streams SSb, SSc, and SSd shown as
separate blocks in Fig. 48 represent the following
content as in Fig. 46.
[0621] The fifth row in each block shows the structure
of system streams SSb, SSc, and SSd. System stream
SSb comprises video stream SSbv and audio stream
SSba; system stream SSc similarly comprises video
stream SScv and audio stream SSca; and system
stream SSd similarly comprises video stream SSdv and
audio stream SSda.

[0622] The fourth rows show the audio packet 
streams A of audio stream SSba, audio stream SSca 
and audio stream SSda extracted from system streams 
SSb, SSc, and SSd. 

[0623] The third rows show the data input/output state 
of the audio buffer 2800 when audio stream SSba, audio 
stream SSca and audio stream SSda are input to a DVD 
decoder DCD shown in Fig. 26. 
[0624] The second rows show the video packet 
streams V of video stream SSbv, video stream SScv, 
and video stream SSdv extracted from system streams 
SSb, SSc, and SSd. 

[0625] The first rows show the data input/output state 
of the video buffer 2600 when video stream SSbv, video 
stream SScv, and video stream SSdv are input to a DVD 
decoder DCD. 

[0626] The audio content of the first several audio 
frames in audio stream SSca and audio stream SSda at 
the beginning of system stream SSc and system stream 
SSd is the same. 

[0627] When system stream SSb is input to the DVD 
decoder DCD, the difference between the input end 
times of the video stream SSbv and audio stream SSba 
to the respective buffers 2600 and 2800 is small, and at 
most is less than the reproduction time of two audio 
frames. 

[0628] When system stream SSc is input to the DVD 
decoder DCD, the difference between the input end 
times of the video stream SScv and audio stream SSca 
to the respective buffers 2600 and 2800 is small, and at 
most is less than the reproduction time of two audio 
frames. 

[0629] When system stream SSd is input to the DVD 
decoder DCD, the difference between the input end 
times of the video stream SSdv and audio stream SSda 
to the respective buffers 2600 and 2800 is small, and at 
most is less than the reproduction time of two audio 
frames. 

[0630] The data input/output state of the video buffer 
2600 when system stream SSb is connected to and con- 
tiguously reproduced with system stream SSc or system 
stream SSd is the same as shown in Fig. 44. Specifical- 
ly, system stream SSa in Fig. 44 corresponds to system 
stream SSb in Fig. 48, and system stream SSb in Fig. 
44 corresponds to either system stream SSc or system 
stream SSd in Fig. 48. 

[0631] When system stream SSb and system stream 
SSd or system stream SSc in Fig. 48 are contiguously 
reproduced using the DVD decoder DCD shown in Fig. 
26, the video buffer also does not overflow as described 
above with reference to Fig. 44. As a result, seamless 
reproduction can be achieved when system stream SSb 
is connected and contiguously reproduced with system 
stream SSc or system stream SSd. 
[0632] Note also that system streams SSb, SSc, and
SSd are produced using the method described with ref- 
erence to Fig. 46. 

[0633] The data structure of system streams SSb, 
SSc, and SSd produced according to the method shown 
in Fig. 46 is shown in Fig. 48, and seamless reproduction 
can therefore be achieved using the DVD decoder DCD 
shown in Fig. 26. 

[0634] As described with reference to the audio frame
movement in Fig. 46, the maximum input end time
difference between the video packets and audio packets at
the end of the system stream SSb is at most the repro- 
duction time of two audio frames, and the maximum in- 
put start time difference between the video packets and 
audio packets at the beginning of system stream SSc 
or SSd is at most the reproduction time of two audio 
frames. 

[0635] When the audio frame moved from audio 
stream SSba is connected to destination audio streams 
SSca and SSda, an audio reproduction stop, i.e., an au- 
dio reproduction gap, is provided when making the con- 
nection. As a result, the differences in the video repro- 
duction time and the audio reproduction time of each 
reproduction path can be corrected based on the repro- 
duction gap information in the system streams not 
shared between different program chains PGC. As a re- 
sult, this video and audio reproduction time difference 
can be prevented from affecting the process connecting 
preceding and following system streams. 
[0636] Fig. 49 is used to describe the difference in the 
video reproduction time and audio reproduction time of 
different reproduction paths according to the present 
embodiment. In Fig. 49, time Tb represents the time dif- 
ference between the audio and video reproduction end 
times at the end of the system stream common to the 
adult-oriented title and the child-oriented title before 
moving the audio data; time Tc is the time difference be- 
tween the audio and video reproduction start times at 
the beginning of the adult-oriented title before audio da- 
ta movement; and time Td is the time difference be- 
tween the audio and video reproduction start times at 
the beginning of the child-oriented title before audio data 
movement. 

[0637] It is possible to match the time difference be- 
tween the audio and video reproduction start times of at 
least one of the plural different reproduction paths fol- 
lowing the branch to the time difference of the audio and 
video reproduction end times before the branch. Note 
that it is assumed in the following description that Tb = 
Tc, and Tb < Td. 

[0638] Because Tb = Tc in the adult-oriented title after 
the branch, the audio frame moved from the common 
part of the adult-oriented and child-oriented title streams 
can be connected to the beginning of the adult-oriented 
title without an audio reproduction gap. 
[0639] To enable seamless reproduction between
system stream SSb and system stream SSc at the
connection, the system streams are generated using the
first system stream encoding method described above
with reference to moving audio data from one system
stream SSb to another system stream SSc.
[0640] The method of producing the system streams
is the same as that described above with reference to
Fig. 46 except that system streams SSa and SSb are
replaced by system streams SSb and SSc in Fig. 49,
and further description thereof is thus omitted below.
[0641] Because Tb < Td in the child-oriented title after
the branch, the audio frame moved from the common
part of the adult-oriented and child-oriented title streams
can be connected to the beginning of the child-oriented
title stream with an audio reproduction gap of only Td -
Tb.

[0642] To enable seamless reproduction between
system stream SSb and system stream SSd at the
connection, the system streams are generated using the
first system stream encoding method described above
with reference to moving audio data from one system
stream SSb to another system stream SSd.

[0643] The method of producing the system streams
is the same as that described above with reference to
Fig. 46 except that system streams SSa and SSb are
replaced by system streams SSb and SSd in Fig. 49,
and further description thereof is thus omitted below.
[0644] Note that packetizing in this case is controlled
so that the audio frames before and after the audio
reproduction gap are not included in the same packet. As
a result, it is possible to write the audio playback starting
time information APTS (the audio frame reproduction
start time including the audio reproduction pause time)
of the audio frames before and after the audio
reproduction gap into the system stream.
[0645] The packet containing the audio frame immediately
preceding the audio reproduction gap is of necessity small.
During the packing process a padding packet is therefore used
to produce a fixed-length pack of 2048 bytes/pack.
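
The padding step can be sketched as follows. This is a minimal illustration, not the encoder's implementation: the 2048-byte pack length is taken from the text, while the function name and the header and packet lengths used in the example are hypothetical.

```python
PACK_SIZE = 2048  # fixed pack length stated in the text


def padding_needed(pack_header_len: int, packet_len: int) -> int:
    """Return how many bytes a padding packet must occupy so that
    pack header + audio packet + padding packet equals PACK_SIZE.

    Both lengths are illustrative inputs; the text does not give them.
    """
    used = pack_header_len + packet_len
    if used > PACK_SIZE:
        raise ValueError("packet does not fit in a single pack")
    return PACK_SIZE - used


# A short packet that ends at the audio reproduction gap leaves the
# rest of the pack to be filled by padding.
short_packet_padding = padding_needed(pack_header_len=14, packet_len=1000)
```

The shorter the last packet before the gap, the larger the padding packet; a packet that already fills the pack needs none.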

[0646] The audio reproduction gap information for the audio
reproduction gap of system stream SSd in this embodiment is
inserted into the system stream by writing the audio frame
reproduction end time immediately before the audio reproduction
gap of the child-oriented title to the audio reproduction
stopping time 1 (VOB_A_STP_PTM1) in the navigation pack NV
(Fig. 20), and writing the audio reproduction gap time Td - Tb
to the audio reproduction stopping period 1 (VOB_A_GAP_LEN1)
in the DSI packet.
[0647] When there is no audio reproduction gap, this can be
indicated by writing a 0 value to the audio reproduction
stopping time 1 (VOB_A_STP_PTM1).
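
A minimal sketch of this gap information follows, assuming times are held as integer clock ticks (the unit is an assumption, not stated here). The class and function names are hypothetical; only the two field meanings and the use of 0 to signal "no gap" come from the description.

```python
from dataclasses import dataclass


@dataclass
class AudioGapInfo:
    stp_ptm: int  # VOB_A_STP_PTM: end time of the frame just before the gap
    gap_len: int  # VOB_A_GAP_LEN: length of the audio reproduction gap


def make_gap_info(frame_end_time: int, gap: int) -> AudioGapInfo:
    """Build the record written to the navigation pack.

    A stopping time of 0 marks the absence of a gap, so a real gap
    cannot be described with frame_end_time == 0 in this sketch.
    """
    if gap <= 0:
        return AudioGapInfo(stp_ptm=0, gap_len=0)
    return AudioGapInfo(stp_ptm=frame_end_time, gap_len=gap)


def has_gap(info: AudioGapInfo) -> bool:
    """A reader of the record only needs to test the stopping time."""
    return info.stp_ptm != 0
```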
[0648] By means of the above process, it is possible to set the
time difference between the different audio and video
reproduction times of different reproduction paths to the audio
reproduction gap of the system streams not shared by different
program chains PGC.
[0649] In addition, by writing information relating to 
the audio reproduction gap to the reproduction control
information contained in the navigation packs NV, the 
audio reproduction gap and the information relating to 
the audio reproduction gap can all be contained within 
a single system stream. 

[0650] Furthermore, by containing the audio reproduction gap
and the information relating to the audio reproduction gap
within a single system stream, it is possible to move the audio
reproduction gap within the system stream. It is therefore
possible to move the audio reproduction gap to a silent space
or other place where it is least audibly perceptible, and
thereby achieve a more seamless reproduction.

[0651] The internal structure of the system encoder 
900 in the DVD encoder ECD shown in Fig. 25 is shown 
in detail in the block diagram in Fig. 50. Note that the 
system encoder 900 generates the system streams. 
[0652] As shown in Fig. 50, the system encoder 900 
comprises an elementary stream buffer 3301 for tempo- 
rarily storing the video, sub-picture, and audio data; a 
video analyzer 3302 for simulating the video buffer 
state; a sub-picture analyzer 3308 for simulating the 
sub-picture buffer state; an audio analyzer 3303 for sim- 
ulating the audio buffer state; a movement calculator 
3304 for calculating the number of audio frames to 
move; a packet producer 3305 for packetizing the video 
data, audio data, and sub-picture data; a multiplexer 
3306 for determining the packet arrangement; and a 
pack producer 3307 for packing the packets to produce 
the system stream. 

[0653] The elementary stream buffer 3301 is connect- 
ed to the video stream buffer 400, sub-picture stream 
buffer 600, and audio stream buffer 800 shown in Fig. 
26, and temporarily stores the elementary streams. The 
elementary stream buffer 3301 is also connected to the 
packet producer 3305. 

[0654] The video analyzer 3302 is connected to the 
video stream buffer 400, thus receives the encoded vid- 
eo stream St27, simulates the video buffer state, and 
supplies the simulation result to the movement calcula- 
tor 3304 and multiplexer 3306. 
[0655] The audio analyzer 3303 is likewise connected 
to the audio stream buffer 800, thus receives the encod- 
ed audio stream St31, simulates the audio buffer state, 
and supplies the simulation result to the movement cal- 
culator 3304 and multiplexer 3306. 
[0656] The sub-picture analyzer 3308 is likewise con- 
nected to the sub-picture stream buffer 600, thus re- 
ceives the encoded sub-picture stream St29, simulates 
the sub-picture buffer state, and supplies the simulation 
result to the movement calculator 3304 and multiplexer 
3306. 

[0657] Based on the simulated buffer states, the movement
calculator 3304 calculates the audio movement (number of audio
frames) and the audio reproduction gap information, and
supplies the calculation results to the packet producer 3305
and multiplexer 3306. More specifically, the movement
calculator 3304 calculates the audio data movement MFAp1 from
the preceding scene, the audio data movement MFAp2 to the
preceding scene, the movement MGVp of 1 GOP of video data to
the preceding scene, the movement MGVf of 1 GOP of video data
from the following scene, the movement MFAf1 of audio data to
the following scene, and the movement MFAf2 of audio data from
the following scene.

[0658] The packet producer 3305 produces the video,
sub-picture, and audio packets from the video data, sub-picture
data, and audio data stored in the elementary stream buffer
3301 according to the audio movement calculated by the movement
calculator 3304. The packet producer 3305 also produces the
reproduction control information, i.e., the navigation packs
NV. The audio reproduction gap information is also written to
the navigation packs NV at this time.

[0659] Based on the audio reproduction gap information and the
video and audio buffer state information simulated by the video
analyzer 3302 and audio analyzer 3303, the multiplexer 3306
rearranges, i.e., multiplexes, the video packets, audio
packets, and navigation packs NV. The movement calculator 3304
also performs based on the audio reproduction gap information.

[0660] The pack producer 3307 then packs the packets, adds the
system header, and produces the system stream.

[0661] Note that the operation of the system encoder 
900 is described in detail below with reference to Fig. 53. 

[0662] The present embodiment relates to connecting system
streams by coupling. The next embodiment relates to connecting
system streams at the trailing end of a multi-scene period,
i.e., connecting one of plural preceding system streams to the
common system stream following thereafter.

[0663] The physical structure of the optical disk, the overall
data structure of the optical disk, and the DVD decoder DCD in
the present embodiment are as described above, and further
description thereof is thus omitted below.

[0664] Note that the description of the present embod- 
iment below is limited to a single encoded video stream 
and a single encoded audio stream for simplicity. 
[0665] Fig. 51 shows the structure of the end of the two
parental lock control system streams SSc and SSd, and the
beginning of the following common system stream SSe that can be
connected to either of the preceding parental lock control
system streams SSc and SSd. Note that this figure is basically
the same as Fig. 48 in which the parental lock control system
streams are the following system streams.

[0666] Note that one of the two parental lock control system
streams SSc and SSd and the common system stream SSe are
arrayed to the same time-base (horizontal time axis T) as shown
in Fig. 51.

[0667] System streams SSc, SSd, and SSe shown as 
separate blocks in Fig. 51 represent the following con- 
tent as in Fig. 46. 
[0668] The fifth row in each block shows the structure
of system streams SSc, SSd, and SSe. System stream 
SSc comprises video stream SScv and audio stream 
SSca; system stream SSd similarly comprises video 
stream SSdv and audio stream SSda; and system 
stream SSe comprises video stream SSev and audio 
stream SSea. 

[0669] The fourth rows show the audio packet 
streams A of audio stream SSca, audio stream SSda, 
and audio stream SSea, extracted from system streams 
SSc, SSd, and SSe. 

[0670] The third rows show the data input/output state 
of the audio buffer 2800 when audio stream SSca, audio 
stream SSda, and audio stream SSea, are input to the 
DVD decoder DCD. 

[0671] The second rows show the video packet 
streams V of video stream SScv, video stream SSdv, 
and video stream SSev extracted from system streams 
SSc, SSd, and SSe.

[0672] The first rows show the data input/output state 
of the video buffer 2600 when video stream SScv, video 
stream SSdv, and video stream SSev are input to the 
DVD decoder DCD. 

[0673] At the end of system streams SSc and SSd,
the video content of at least the last GOP in each video
stream SScv and SSdv is the same.
[0674] Likewise, the audio content of the last several 
audio frames in audio streams SSca and SSda at the 
end of system streams SSc and SSd is the same. 
[0675] When system stream SSc is input to the DVD 
decoder DCD, the difference between the input end 
times of the video stream SScv and audio stream SSca 
to the respective buffers 2600 and 2800 is small, and at 
most is less than the reproduction time of two audio 
frames. 

[0676] When system stream SSd is input to the DVD 
decoder DCD, the difference between the input end 
times of the video stream SSdv and audio stream SSda 
to the respective buffers 2600 and 2800 is small, and at 
most is less than the reproduction time of two audio 
frames. 

[0677] When system stream SSe is input to the DVD 
decoder DCD, the difference between the input end 
times of the video stream SSev and audio stream SSea 
to the respective buffers 2600 and 2800 is small, and at 
most is less than the reproduction time of two audio 
frames. 

[0678] The data input/output state of the video buffer 
2600 when system stream SSc or system stream SSd 
is connected to and contiguously reproduced with sys- 
tem stream SSe is the same as shown in Fig. 44. Spe- 
cifically, system stream SSa in Fig. 44 corresponds to 
either system stream SSc or system stream SSd in Fig. 
51, and system stream SSb in Fig. 44 corresponds to 
system stream SSe in Fig. 51. 
[0679] Seamless reproduction can thus be achieved 
when system stream SSc or system stream SSd is con- 
nected and contiguously reproduced with system 
stream SSe.

[0680] Note also that system streams SSc, SSd, and SSe are
produced using the second method described above with reference
to Fig. 47. More specifically, the system streams can be
similarly created by substituting system streams SSc and SSd in
Fig. 51 for system stream SSa in Fig. 47, and substituting
system stream SSe in Fig. 51 for system stream SSb in Fig. 47.
The method of generating the system streams is as described
above with reference to Fig. 47.

[0681] The data structure of system streams SSc, SSd, and SSe
produced according to the method shown in Fig. 47 is shown in
Fig. 51, and seamless reproduction can therefore be achieved
using the DVD decoder DCD shown in Fig. 26.

[0682] As described with reference to audio frame movement in
Fig. 46, the input end time difference between the video
packets and audio packets at the end of system streams SSc and
SSd is at most the reproduction time of two audio frames, and
the input start time difference between the video packets and
audio packets at the beginning of system stream SSe is at most
the reproduction time of two audio frames.

[0683] By providing an audio reproduction stop, i.e., an audio
reproduction gap, when moving and connecting audio frames from
audio stream SSea to destination audio streams SSca and SSda,
the differences in the video reproduction time and the audio
reproduction time of each reproduction path can be contained
within the system streams not shared between different program
chains PGC.

[0684] Fig. 52 is used to describe the difference in the video
reproduction time and audio reproduction time of different
reproduction paths according to the present embodiment. In
Fig. 52, time Te represents the time difference between the
audio and video reproduction start times at the beginning of
the adult-oriented title before audio data movement; time Tc'
is the time difference between the audio and video reproduction
end time at the end of the adult-oriented title stream; and
time Td' is the time difference between the audio and video
reproduction end times at the end of the child-oriented title
stream before moving the audio data.

[0685] It is possible to match the time difference between the
audio and video reproduction end times of at least one of the
plural different reproduction paths before the connection with
the time difference of the audio and video reproduction start
times following the connection. Note that it is assumed in the
following description that Te = Tc' and Td' < Te.

[0686] Because Te = Tc' in the adult-oriented title before the
connection, the audio frame moved from the common part of the
adult-oriented and child-oriented title streams can be
connected to the end of the adult-oriented title stream without
an audio reproduction gap. A seamless stream is then produced
after the connection as shown in the figure.
[0687] Because Td' < Te in the child-oriented title
stream before the connection, the audio frame moved 
from the common part of the adult-oriented and child- 
oriented title streams can be connected to the end of the 
child-oriented title stream with an audio reproduction 
gap of only Te - Td'. 
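
The two cases above reduce to one subtraction, sketched below under the assumptions of this embodiment (Te = Tc' and Td' < Te). The function name and the numeric values are purely illustrative and are not taken from Fig. 52.

```python
def audio_gap(te: float, end_diff: float) -> float:
    """Gap needed at the end of a branch stream whose audio/video
    end-time difference is end_diff, when the common stream starts
    with an audio/video start-time difference of te.

    Only the cases discussed in the text (end_diff <= te) are handled.
    """
    if end_diff > te:
        raise ValueError("case not covered by the description")
    return te - end_diff


TE = 5.0                               # Te, illustrative value
paths = {"adult": 5.0, "child": 3.0}   # Tc' and Td', illustrative values
gaps = {name: audio_gap(TE, diff) for name, diff in paths.items()}
```

With these values the adult-oriented path needs no gap and the child-oriented path needs a gap of Te - Td'.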

[0688] To enable seamless reproduction between 
system stream SSc and system stream SSd at the con- 
nection with system stream SSe, the system streams 
are generated using the second system stream encod- 
ing method described above with reference to moving 
the encoded video stream and audio data from one sys- 
tem stream SSe to another system stream SSc and 
SSd. 

[0689] The method of producing the system streams 
is the same as that described above with reference to 
Fig. 47 except that system streams SSc and SSd in Fig. 
51 are substituted for system stream SSa in Fig. 47, and 
system stream SSe in Fig. 51 is substituted for system 
stream SSb in Fig. 47, and further description thereof is 
thus omitted below. 

[0690] When producing these system streams, the 
packets are generated so that the audio frames before 
and after the audio reproduction gap are not contained 
in the same packet. As a result, it is possible to write the 
audio playback starting time information APTS (the au- 
dio frame reproduction start time including the audio re- 
production pause time) of the audio frames before and 
after the audio reproduction gap into the system stream. 
[0691] The packet containing the audio frame imme- 
diately preceding the audio reproduction gap is of ne- 
cessity small. During the packing process a padding 
packet is therefore used to produce a fixed-length pack 
of 2048 bytes/pack. 
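
The packetizing rule can be sketched as below. This is a simplified model under stated assumptions: packets are treated as lists of whole audio frames with a fixed frame count per packet, whereas real packets are sized in bytes. The function and parameter names are hypothetical.

```python
def packetize(frames, frames_per_packet, gap_after_index):
    """Group audio frames into packets, forcing a packet boundary
    immediately after frames[gap_after_index] so that the frames
    before and after the audio reproduction gap never share a packet.
    """
    packets, current = [], []
    for i, frame in enumerate(frames):
        current.append(frame)
        # Close the packet when it is full or when the gap follows.
        if len(current) == frames_per_packet or i == gap_after_index:
            packets.append(current)
            current = []
    if current:
        packets.append(current)
    return packets


packets = packetize(["af1", "af2", "af3", "af4", "af5"],
                    frames_per_packet=3, gap_after_index=3)
```

In this example the packet holding af4 is cut short by the forced boundary, which is why the packet preceding the gap tends to be small and needs padding.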

[0692] The audio reproduction gap information for the 
audio reproduction gap of system stream SSd in this 
embodiment is inserted to the system stream by writing 
the audio frame reproduction end time immediately be- 
fore the audio reproduction gap of the child-oriented title 
to the audio reproduction stopping time 2 
(VOB_A_STP_PTM2) in the navigation pack NV (Fig. 
20), and writing the audio reproduction gap time Te - Td' 
to the audio reproduction stopping period 2 
(VOB_A_GAP_LEN2) in the DSI packet. 
[0693] When there is no audio reproduction gap, it is 
possible to determine that there is no audio reproduction 
gap by writing a 0 value to the audio reproduction stop- 
ping time 2 (VOB_A_STP_PTM2). 
[0694] By means of the above process, it is possible 
to set the time difference between the different audio 
and video reproduction times of different reproduction 
paths to the audio reproduction gap of the system 
streams not shared by different program chains PGC. 
[0695] In addition, by writing information relating to 
the audio reproduction gap to the reproduction control 
information contained in the navigation packs NV, the 
audio reproduction gap and the information relating to 
the audio reproduction gap can all be contained within
a single system stream.

[0696] Furthermore, by containing the audio reproduction gap
and the information relating to the audio reproduction gap
within a single system stream, it is possible to move the audio
reproduction gap within the system stream. It is therefore
possible to move the audio reproduction gap to a silent space
or other place where it is least audibly perceptible, achieve
seamless data reproduction not permitting the audio buffer to
underflow, and thereby achieve seamless reproduction of the
audio information that is important for human perception of
data contiguity.

[0697] The system streams described above can be produced using
the system encoder 900 of the DVD encoder ECD shown in Fig. 25.
The structure of the system encoder 900 is as described above
with reference to Fig. 50, and further description thereof is
thus omitted below.

[0698] The process of producing the above described system
streams is described below with reference to Fig. 53. Note that
this process is the system encoding subroutine shown as step
#2200 of the system encoder flow chart shown in Fig. 34.

System encoder flow chart 


[0699] The system encoding process is described be- 
low with reference to Fig. 53. 

[0700] At step #307002 the conditions for connecting with the
preceding scene are evaluated based on the state of the
preceding VOB seamless connection flag VOB_Fsb. If a
non-seamless connection with the preceding scene is specified,
i.e., VOB_Fsb ≠ 1, the procedure moves to step #307010.

[0701] At step #307010 the movement calculator 3304 (Fig. 50)
sets the audio data movement MFAp1 from the preceding scene,
i.e., the number of audio frames moved, to 0 based on the
VOB_Fsb ≠ 1 declaration. The procedure then moves to step
#307014.
[0702] If a seamless connection with the preceding scene is
specified, i.e., VOB_Fsb = 1, at step #307002, the procedure
moves to step #307004.
[0703] At step #307004 it is determined whether the preceding
scene is in a multi-scene period. If it is not, the procedure
moves to step #307012; if it is, the procedure moves to step
#307006.

[0704] At step #307012 the audio data movement MFAp1 from the
preceding scene is calculated, and the procedure moves to step
#307014. Note that the method of calculating the audio data
movement MFAp1 is described after this process with reference
to Fig. 54 below.

[0705] At step #307006 the movement MGVp of 1 GOP of video data
to the preceding scene is calculated, and the procedure moves
to step #307008. If the preceding scene is in a multi-scene
period, it is not possible to uniformly calculate the audio
data movement MFAp1 as in step #307012. As a result, the
movement of one GOP of video data from the beginning of the
present scene to the preceding scene is calculated.
[0706] At step #307008 the audio data movement MFAp2 to the
following scene is calculated, and the procedure moves to step
#307014. Note that the method of calculating the audio data
movement MFAp2 is described after this process with reference
to Fig. 55 below.

[0707] At step #307014 the conditions for connecting with the
following scene are evaluated based on the state of the
following VOB seamless connection flag VOB_Fsf. If a
non-seamless connection with the following scene is specified,
i.e., VOB_Fsf ≠ 1, the procedure moves to step #307022. If a
seamless connection with the following scene is specified,
i.e., VOB_Fsf = 1, the procedure moves to step #307016.
[0708] At step #307022 the movement calculator 3304 (Fig. 50)
sets the audio data movement MFAp1 to the following scene to 0
based on the VOB_Fsf ≠ 1 declaration. The procedure then moves
to step #307026.
[0709] At step #307016 it is determined whether the following
scene is in a multi-scene period based on the multi-scene flag
VOB_Fp. If it is not, i.e., VOB_Fp ≠ 1, the procedure moves to
step #307024; if it is, i.e., VOB_Fp = 1, the procedure moves
to step #307018.
[0710] At step #307024 the audio data movement MFAp1, MFAp2 to
the following scene is calculated, and the procedure moves to
step #307026. Note that the method of calculating the audio
data movement MFAp2 is the same as that used in step #307012.
[0711] At step #307018 the movement MGVf of 1 GOP of video data
from the following scene is calculated, and the procedure moves
to step #307020.
[0712] At step #307020 the audio data movement MFAf2 from the
following scene is calculated, and the procedure moves to step
#307026. Note that the method of calculating the audio data
movement MFAf2 is the same as that used in step #307008.
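
The branching of steps #307002 through #307026 can be summarized by the following sketch, which only traces which steps are visited for a given set of flags. It is an illustration of the control flow as described, not the encoder's code; the function name and the boolean parameters standing in for the multi-scene tests are hypothetical.

```python
def movement_steps(vob_fsb: int, prev_multi_scene: bool,
                   vob_fsf: int, next_multi_scene: bool) -> list:
    """Return the step numbers visited, in order, up to step #307026."""
    steps = [307002]
    if vob_fsb != 1:                  # non-seamless with preceding scene
        steps.append(307010)          # movement set to 0
    else:
        steps.append(307004)          # preceding scene multi-scene?
        if prev_multi_scene:
            steps += [307006, 307008]  # GOP movement, then audio movement
        else:
            steps.append(307012)
    steps.append(307014)
    if vob_fsf != 1:                  # non-seamless with following scene
        steps.append(307022)          # movement set to 0
    else:
        steps.append(307016)          # following scene multi-scene?
        if next_multi_scene:
            steps += [307018, 307020]
        else:
            steps.append(307024)
    steps.append(307026)
    return steps
```

For example, a scene that follows a multi-scene period seamlessly and is followed seamlessly by a single scene visits #307006 and #307008 on the preceding side and #307024 on the following side.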
[0713] At step #307026 the audio reproduction stopping time 1
(VOB_A_STP_PTM1) and the audio reproduction stopping period 1
(VOB_A_GAP_LEN1) are calculated from the audio and video end
times of the preceding scene. The procedure then moves to step
#307028.

[0714] At step #307028 the audio reproduction stopping time 2
(VOB_A_STP_PTM2) and the audio reproduction stopping period 2
(VOB_A_GAP_LEN2) are calculated from the audio and video start
times in the following scene. The procedure then moves to step
#307030.

[0715] At step #307030 the audio data, including the
audio movement, is packetized, and the procedure
moves to step #307032.

[0716] At step #307032 the video data, including the
video movement, is packetized, and the procedure
moves to step #307034.
[0717] At step #307034 the navigation pack NV is generated, the
audio reproduction stopping time 1 (VOB_A_STP_PTM1) and the
audio reproduction stopping period 1 (VOB_A_GAP_LEN1), and the
audio reproduction stopping time 2 (VOB_A_STP_PTM2) and the
audio reproduction stopping period 2 (VOB_A_GAP_LEN2) are
recorded, and the procedure moves to step #307036.

[0718] At step #307036 the video packets V, audio 
packets A, and navigation pack NV are multiplexed. 
[0719] As described above, it is thus possible to move 
audio and video data between scenes according to the 
conditions for connections with the preceding and fol- 
lowing scenes, and generate the system stream accord- 
ingly. 

[0720] The method of calculating the audio data
movement MFAp1 in step #307012 above is described
below with reference to Fig. 54.
[0721] In Fig. 54 video 1 is the video data at the end 
of the preceding scene, with the video 1 line represent- 
ing the change in video data accumulation at the end of 
the preceding scene in the video buffer 2600 of the DVD 
decoder DCD; video 2 is similarly the video data at the 
beginning of said scene with the video 2 line represent- 
ing the change in said video data accumulation in the 
video buffer 2600 at the beginning of said scene. 
[0722] Note that both video 1 and video 2 represent 
the state of the video buffer before system stream con- 
nection. VDTS is the time video 2 is first decoded; tv is 
the video 2 transfer start time, and is calculated from 
equation 1 below where video buffer verifier delay
vbv_delay is defined as the time from the start of data 
input to the video buffer to the start of decoding. If de- 
coding starts at vbv_delay after the start of data input to 
the video buffer, a video buffer data underflow state can 
be reliably prevented during the following decoding 
process. 

tv = VDTS - vbv_delay [1] 

[0723] Audio 1 shows the transfer of audio frames at the end of
the preceding scene to the audio buffer where af1, af2, af3,
and af4 are the audio frames contained in audio 1. Note that
the audio frame is the encoding process unit, and contains the
audio data for a defined period of time (Af).

[0724] Audio 2 shows the transfer of audio frames at 
the beginning of the scene to the audio buffer where af5 
and af6 are the audio frames contained in audio 2. 
[0725] APTS is the time the audio in audio 2 is first 
reproduced. 

[0726] The audio frames (af3, af4) transferred during period
APTS from time tv, i.e., the number of audio frames (Amove)
MFAp1 attached to audio 1 transferred after the start of video
2 transfer, is calculated according to equation 2.

Amove = (APTS - tv - Af) / Af [2] 
[0727] The audio data movement (number of audio
frames) from the preceding scene is thus calculated. 
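
Equations 1 and 2 can be worked through with a short sketch. The function names and the numeric values are illustrative assumptions; the text does not say how a fractional result is rounded, so the raw quotient is returned.

```python
def transfer_start_time(vdts: float, vbv_delay: float) -> float:
    """Equation 1: tv = VDTS - vbv_delay."""
    return vdts - vbv_delay


def frames_from_preceding(apts: float, tv: float, af: float) -> float:
    """Equation 2: Amove = (APTS - tv - Af) / Af."""
    return (apts - tv - af) / af


# Illustrative values in arbitrary time units.
tv = transfer_start_time(vdts=1000, vbv_delay=200)
amove = frames_from_preceding(apts=1100, tv=tv, af=100)
```

With these sample values tv is 800 and Amove is 2, matching a case in which two audio frames are moved.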
[0728] The method of calculating the audio data movement MFAp2
to the following scene in step #307008 above is described below
with reference to Fig. 55.

[0729] As in Fig. 54, video 1 is the video data at the end of
the preceding scene, and video 2 is similarly the video data at
the beginning of said scene. Note that both video 1 and video 2
show the video buffer state before scene connection. VDTS is
the time video 2 is first decoded; GOP_move is the one GOP
video data MGVp moved in step #307006; tv is the time video 2
transfer starts after moving the GOP_move quantity of GOP, and
can be uniformly calculated.
[0730] Audio 1 shows the transfer of audio frames at the end of
the preceding scene to the audio buffer where af1, af2, af3,
and af4 are the audio frames contained in audio 1. Note that
the audio frame is the encoding process unit, and contains the
audio data for a defined period of time (Af).

[0731] Audio 2 shows the transfer of audio frames at
the beginning of the scene to the audio buffer where af5,
af6, and af7 are the audio frames contained in audio 2.
[0732] APTS is the time the audio in audio 2 is first
reproduced.

[0733] The audio frames (af5, af6, af7) transferred during
period APTS from time tv, i.e., the number of audio frames
(Amove) MFAp2 attached to audio 2 transferred before the start
of video 2 transfer after moving GOP_move quantity of GOP, is
calculated according to equation 3.

Amove = (APTS - tv + 2Af) / Af [3]

[0734] The audio data movement (number of audio 
frames) to the preceding scene is thus calculated. 
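
Equation 3 can be evaluated the same way. The function name and sample values below are illustrative assumptions, and no rounding is applied because the text does not specify one.

```python
def frames_to_preceding(apts: float, tv: float, af: float) -> float:
    """Equation 3: Amove = (APTS - tv + 2Af) / Af.

    tv is the video 2 transfer start time after the GOP_move
    quantity of GOP has been moved.
    """
    return (apts - tv + 2 * af) / af


# Illustrative values in arbitrary time units.
amove3 = frames_to_preceding(apts=900, tv=800, af=100)
```

With these sample values Amove is 3, matching a case in which three audio frames are moved.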

Audio gap reproduction processing

[0735] While the basic structure of the DVD decoder 
DCD used in the present embodiment is as shown in 
Fig. 26, the synchronizer 2900 is structured as shown 
in Fig. 56 to process the audio reproduction gap.
[0736] As shown in Fig. 56, a block diagram of the 
synchronizer 2900 shown in Fig. 26, the synchronizer 
2900 comprises an STC generator 2950, audio decoder 
controller 2952, and audio decoder control data buffer 
2954.
[0737] The STC generator 2950 generates the sys- 
tem clock STC used as the reference clock for decoding 
control based on the system clock reference SCR value 
set by the decoding system controller 2300. 
[0738] The audio decoder controller 2952 controls the decoding
start and stop of the audio decoder 3200 based on the STC value
from the STC generator 2950 and the control information from
the audio decoder control data buffer 2954.

[0739] The audio decoder control data buffer 2954 
stores the values of the audio decoding control informa- 
tion (such as VOB_A_STP_PTM and 
VOB_A_GAP_LEN) set by the decoding system con- 
troller 2300. 

[0740] The operation of the synchronizer 2900 thus 
comprised according to the present embodiment is de- 
scribed below with reference to Fig. 26 and Fig. 56. 
[0741] The overall operation of the DVD decoder DCD 
in Fig. 26 is as previously described, and further descrip- 
tion thereof is thus omitted below. The operation related 
to the specific processes of the present embodiment is 
described below. 

[0742] Referring to Fig. 26, the decoding system con- 
troller 2300 reads the audio reproduction stopping time 
1 (VOB_A_STP_PTM1), the audio reproduction stop- 
ping period 1 (VOB_A_GAP_LEN1), the audio repro- 
duction stopping time 2 (VOB_A_STP_PTM2), and the 
audio reproduction stopping period 2 
(VOB_A_GAP_LEN2) from the DSI packet in the navi- 
gation pack NV, and stores these four values as the au- 
dio decode reproduction stopping information to the au- 
dio decoder control data buffer 2954 of the synchronizer 
2900. 

[0743] When the time supplied from the STC genera- 
tor 2950 matches the audio reproduction stopping time 
1 (VOB_A_STP_PTM1) stored in the audio decoder 
control data buffer 2954, the audio decoder controller 
2952 stops the audio decoder 3200 for the audio repro- 
duction stopping period 1 (VOB_A_GAP_LEN1) stored 
in the audio decoder control data buffer 2954. Likewise 
when the time supplied from the STC generator 2950 
matches the audio reproduction stopping time 2 
(VOB_A_STP_PTM2) stored in the audio decoder control
data buffer 2954, the audio decoder controller 2952
stops the audio decoder 3200 for the audio reproduction 
stopping period 2 (VOB_A_GAP_LEN2) stored in the 
audio decoder control data buffer 2954. 
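
The stop-and-resume behavior can be modeled with a small simulation, offered as a sketch under stated assumptions rather than the controller's actual design: the STC is sampled at discrete integer values, a stopping time of 0 means no gap, and the function name and the list-of-pairs input are hypothetical.

```python
def audio_decoder_states(stc_values, stops):
    """For each STC sample, report whether the audio decoder is running.

    stops is a list of (stopping_time, stopping_period) pairs, e.g. the
    values of VOB_A_STP_PTM1/VOB_A_GAP_LEN1 and
    VOB_A_STP_PTM2/VOB_A_GAP_LEN2.
    """
    states = []
    resume_at = None  # STC value at which decoding restarts
    for stc in stc_values:
        for stp_ptm, gap_len in stops:
            # A stopping time of 0 is treated as "no gap".
            if stp_ptm != 0 and stc == stp_ptm:
                resume_at = stc + gap_len
        if resume_at is not None and stc >= resume_at:
            resume_at = None
        states.append(resume_at is None)
    return states


states = audio_decoder_states(range(6), [(2, 2), (0, 0)])
```

In this example the decoder is stopped at STC values 2 and 3 and runs again from 4 onward.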
[0744] By thus comprising a STC generator 2950 and 
audio decoder controller 2952, the synchronizer 2900 is 
able to process audio reproduction gaps contained in 
the system stream of a multi-scene period when con- 
necting a stream from a multi-scene period with a com- 
mon scene stream. 

[0745] Note that an audio reproduction gap may occur 
in the present invention in one or both of VOB 6 and 
VOB 7 corresponding to scenes 6 and 7 in a parental 
lock control scene period as shown in Fig. 21. 
[0746] The decoding process executed by the decod- 
ing system controller 2300 of the present invention is 
described briefly below with reference to Fig. 60, Fig. 
61, Fig. 62, Fig. 63, and Fig. 64. The process executed 
by the audio decoder controller 2952 of the present in- 
vention is then described with reference to Fig. 57. 
[0747] In Fig. 60 the title selected by the user is extracted
from the multimedia bitstream MBS stored to the digital video
disk, and the VTS_PGCI #i program chain (PGC) data for
reproducing the selected title is extracted by the decoding
system controller 2300 of the DVD decoder DCD at step #310214.
Then at step #310216 the selected title is reproduced based on
the extracted VTS_PGCI #i program chain (PGC) information. The
process shown in Fig. 60 has already been described in detail
above, and further description thereof is thus omitted below.

[0748] The process of reproducing the VTS_PGCI #i program chain
in step #310216, Fig. 60, is shown in Fig. 61 and described
below.

[0749] At step #31030 the decoding system table shown in
Fig. 58 is set. The transfer process to the stream buffer 2400
(step #31032), and the data decoding process in the stream
buffer 2400 (step #31034) are executed in parallel. Note that
the process of step #31032 is based on the cell reproduction
information in the PGC information entries C_PBI #j. The
process shown in Fig. 61 has already been described in detail
above, and further description thereof is thus omitted below.

[0750] The stream buffer data transfer executed for each cell
reproduction information entry (PGC information entries
C_PBI #j) by the process of step #31032 is described in further
detail below referring to Fig. 62. Because a parental lock
control scene is being processed in the present embodiment,
step #31040 of Fig. 62 returns NO, and the procedure moves to
step #31044. The process shown in Fig. 62 has already been
described in detail above, and further description thereof is
thus omitted below.

[0751] The non-multi-angle cell decoding process, i.e., the
parental lock control cell decoding process executed as step
#31044, Fig. 62, is described further below with reference to
Fig. 63. Step #31050 evaluates the interleaved allocation flag
IAF_reg to determine whether the cell is in an interleaved
block. Because the seamless connection, parental lock control
title processed by the present embodiment is arrayed to an
interleaved block, step #31050 routes control to step #31052.
The process shown in Fig. 63 has already been described in
detail above, and further description thereof is thus omitted
below.

[0752] The non-multi-angle interleaved block process
(step #31052, Fig. 63) is described further below with
reference to Fig. 64. At step #31062 the audio reproduction
stopping time 1 (VOB_A_STP_PTM1), the audio
reproduction stopping period 1 (VOB_A_GAP_LEN1), the
audio reproduction stopping time 2 (VOB_A_STP_PTM2),
and the audio reproduction stopping period 2
(VOB_A_GAP_LEN2) are extracted as the table data from
the DSI packet in the navigation pack NV (Fig. 20) and
stored to the audio decoder control data buffer 2954
(Fig. 56). The procedure then moves to step #31064,
whereby VOB data transfer is continued until it is
determined at step #31066 that all interleave units in
the interleaved block have been transferred.
[0753] The process executed by the audio decoder
controller 2952 in Fig. 56 is described next with
reference to Fig. 57.

[0754] At step #202301 the audio decoder controller 
2952 reads the audio reproduction stopping time 1 
(VOB_A_STP_PTM1) from the audio decoder control 
data buffer 2954, and compares VOB_A_STP_PTM1 
with the system clock STC from the STC generator 
2950. If the values match, i.e., a YES is returned, the 
procedure moves to step #202302; if the values do not 
match, i.e., a NO is returned, the procedure moves to 
step #202303. 

[0755] At step #202302 the audio reproduction stop- 
ping period 1 (VOB_A_GAP_LEN1) is read from the au- 
dio decoder control data buffer 2954, and the audio de- 
coder 3200 is stopped for this period. 
[0756] At step #202303, the audio decoder controller 
2952 reads the audio reproduction stopping time 2 
(VOB_A_STP_PTM2) from the audio decoder control 
data buffer 2954, and compares VOB_A_STP_PTM2 
with the system clock STC from the STC generator 
2950. If the values match, i.e., a YES is returned, the 
procedure moves to step #202304; if the values do not 
match, i.e., a NO is returned, the procedure returns to 
step #202301. 

[0757] At step #202304 the audio reproduction stop- 
ping period 2 (VOB_A_GAP_LEN2) is read from the au- 
dio decoder control data buffer 2954, and the audio de- 
coder 3200 is stopped for this period. 
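The comparison logic of steps #202301 to #202304 can be illustrated with a short sketch. The class `AudioGapInfo`, the functions `stop_period_at` and `simulate`, and the use of plain integers for the system clock STC are assumptions introduced for this example; the text does not specify how the controller 2952 is implemented, nor what happens if both stopping times coincide (the sketch simply gives stopping time 1 priority).

```python
from dataclasses import dataclass


@dataclass
class AudioGapInfo:
    """Values held in the audio decoder control data buffer 2954."""
    stp_ptm1: int  # VOB_A_STP_PTM1
    gap_len1: int  # VOB_A_GAP_LEN1
    stp_ptm2: int  # VOB_A_STP_PTM2
    gap_len2: int  # VOB_A_GAP_LEN2


def stop_period_at(stc: int, info: AudioGapInfo) -> int:
    """Return how long the audio decoder should stop at clock value stc.

    Returns 0 when neither stopping time matches, which corresponds to
    the flow looping back to step #202301.
    """
    if stc == info.stp_ptm1:   # step #202301: compare with stopping time 1
        return info.gap_len1   # step #202302: stop for period 1
    if stc == info.stp_ptm2:   # step #202303: compare with stopping time 2
        return info.gap_len2   # step #202304: stop for period 2
    return 0


def simulate(info: AudioGapInfo, stc_values):
    """Hypothetical driver: collect (stc, stop_period) for each stop issued."""
    stops = []
    for stc in stc_values:
        period = stop_period_at(stc, info)
        if period:
            stops.append((stc, period))
    return stops
```

For example, with stopping times 100 and 200 and periods 5 and 7, stepping the clock from 90 to 200 in increments of 10 issues exactly two stops, one at each stopping time.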
[0758] The audio reproduction stopping time informa- 
tion (VOB_A_STP_PTM and VOB_A_GAP_LEN) is 
thus written to the DSI packet of the navigation pack NV 
in the system stream. Based on this audio reproduction 
stopping time information, the DVD decoder DCD com- 
prising an audio decoder control data buffer 2954 and 
an audio decoder controller 2952 for controlling the au- 
dio stream decoding operation is able to process audio 
reproduction gaps found in parental lock control scenes, 
i.e., in system streams shared by plural different pro- 
gram chains as shown in Fig. 30. It is therefore able to 
prevent intermittent video reproduction (video freezing) 
and intermittent audio reproduction (muting) caused by 
a data underflow state in the video buffer or audio buffer 
resulting when one common system stream is connect- 
ed to one of plural system streams branching from (fol- 
lowing) or to (preceding) the one system stream. 
[0759] Note that while audio data is moved in audio 
frame units in the above embodiment, the same effect 
can be achieved if the audio frames are broken into 
smaller units used as the movement unit to connect and 
contiguously reproduce system streams. 
[0760] Furthermore, while video data is moved in 
GOP units according to the second system stream pro- 
duction method in the above embodiment, the same ef- 
fect can be achieved if the GOP units are broken into 
smaller units used as the movement unit to connect and 
contiguously reproduce system streams. 
[0761] Furthermore, while only audio data is moved
according to the first system stream production method
in the above embodiment, the same effect can be
achieved if video data is also moved from the system
stream preceding the connection to the system stream
following the connection.

[0762] The present embodiment has also been described
with reference to only one video stream and one
audio stream, but the invention shall not be so limited.
[0763] While the present embodiment has been described
with particular reference to branching and connecting
streams as used to implement a parental lock
control feature, seamless contiguous reproduction can
also be achieved in multi-angle scene periods in which
the plural video streams provide different perspectives
(views) of the same title content, and using multimedia
optical disks to which system streams configured as
described above are recorded.

[0764] The second system stream production method
described above is described as being used at connections
from one of plural system streams to a single common
system stream in the present embodiment. However,
the same effect can be achieved using the first system
stream production method described above when
the same audio information is recorded to system
streams not shared between different program chains.
[0765] The present embodiment was also described
using a digital video disk DVD, but the same effect can
be achieved using other optical disks recording system
streams having the same data structure as that of the
present embodiment described above.
[0766] With the audio and video data interleaving
method of the present embodiment the audio data input
by the decoding time includes only the data used in the
next audio decode operation and any remainder from
the packet transfer operation (approximately 2 KB).
However, insofar as an audio buffer underflow state
does not occur, i.e., insofar as the interleaving method
interleaves the audio and video data to transfer audio
data in a quantity and frequency preventing an audio
buffer underflow state, the same effect can be achieved.
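The underflow condition referred to here can be checked with a simple occupancy simulation. The sketch below is an illustration under stated assumptions: the function name `audio_buffer_underflows`, the event representation, and the idea that each decode operation removes one fixed-size frame from the buffer are not taken from the text.

```python
def audio_buffer_underflows(arrivals, decode_times, frame_bytes):
    """Return True if the audio buffer would underflow.

    arrivals: list of (time, n_bytes) pairs for audio data entering the buffer.
    decode_times: times at which the decoder removes one audio frame.
    frame_bytes: assumed fixed size of one audio frame in bytes.
    """
    # Tag arrivals with 0 and decodes with 1 so that data arriving at the
    # same instant as a decode is counted before the frame is removed.
    events = [(t, 0, n) for t, n in arrivals]
    events += [(t, 1, -frame_bytes) for t in decode_times]
    level = 0
    for _, _, delta in sorted(events):
        level += delta
        if level < 0:
            return True  # a frame was needed before it had fully arrived
    return False
```

Any interleaving that makes this check return False for the whole stream delivers audio data in sufficient quantity and frequency in the sense described above.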
[0767] Information relating to the audio reproduction
gap at a system stream branch is written to the audio
reproduction stopping time 1 (VOB_A_STP_PTM1) and
the audio reproduction stopping period 1
(VOB_A_GAP_LEN1) fields of the navigation pack NV
in the present embodiment, but this audio reproduction
gap information may be written to the audio reproduction
stopping time 2 (VOB_A_STP_PTM2) and the audio
reproduction stopping period 2 (VOB_A_GAP_LEN2)
fields.

[0768] Information relating to the audio reproduction
gap at a system stream connection is written to the audio
reproduction stopping time 2 (VOB_A_STP_PTM2) and
the audio reproduction stopping period 2
(VOB_A_GAP_LEN2) fields of the navigation pack NV
in the present embodiment, but this audio reproduction
gap information may be written to the audio reproduction
stopping time 1 (VOB_A_STP_PTM1) and the audio
reproduction stopping period 1 (VOB_A_GAP_LEN1)
fields.

[0769] The difference between the input end times to 
the respective buffers of the audio and video data in the 
system stream is defined as at most the reproduction 
time of two audio frames in this embodiment. However, 
if the video is encoded with variable bit rate (VBR) cod- 
ing and the video bit rate before the connection is re- 
duced, the same effect can be achieved even when the 
input start time of the video data to the video buffer is 
advanced. 

[0770] The difference between the input start times to 
the respective buffers of the audio and video data in the 
system stream is defined as at most the reproduction 
time of two audio frames in this embodiment. However, 
if the video is encoded with variable bit rate (VBR) cod- 
ing and the video bit rate before the connection is re- 
duced, the same effect can be achieved even when the 
input end time of the video data to the video buffer is 
delayed. 

[0771] The present embodiment is also described as 
accumulating one audio frame in the audio buffer when 
system streams are connected, but the present inven- 
tion shall not be so limited and the same effects can be 
achieved if a different audio buffer accumulation level is 
used insofar as an audio buffer overflow state is not in- 
duced. 

[0772] Furthermore, while video data is moved in 
GOP units in the above embodiment, if the video data 
input bit rates differ in the connected system streams, 
the same effect can be achieved by encoding the GOP 
to be moved at the input bit rate of the video data in the 
system stream to which the GOP is moved. 
[0773] The compressed audio and video streams are 
also used for data movement in the above embodiment, 
but the same effect can be achieved by first moving the 
data at the pre-encoded material level. 
[0774] Only one GOP is moved in the above embodiment,
but the same effect can be achieved by moving
two or more GOPs.
[0775] It is therefore possible by means of the present 
invention thus described to reproduce system streams 
from different program chains as a single contiguous ti- 
tle without intermittent video presentation (freezing) or 
intermittent audio presentation (muting) when connect- 
ing and contiguously reproducing plural system streams 
from a multimedia optical disk recorded with the video 
packets and audio packets interleaved to a single sys- 
tem stream meeting the following conditions: 

(a) the difference between the input start time of the 
first video packet and the input start time of the first 
audio packet at the beginning of the system stream 
is less than the reproduction time of the number of 
audio frames that can be stored in the audio buffer 
plus one audio frame, and 

(b) the difference between the input end time of the
last video packet and the input end time of the last
audio packet at the end of the system stream is less
than the reproduction time of the number of audio
frames that can be stored in the audio buffer plus
one audio frame.
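Conditions (a) and (b) can be expressed as a small check. The function name `meets_seamless_conditions` and its parameters are hypothetical, all times are assumed to share one unit, and taking the absolute value of each difference is an assumption, since the text speaks only of "the difference".

```python
def meets_seamless_conditions(video_start, audio_start,
                              video_end, audio_end,
                              audio_frame_time, audio_buffer_frames):
    """Check conditions (a) and (b) for one system stream.

    The limit is the reproduction time of the number of audio frames the
    audio buffer can hold, plus one further audio frame.
    """
    limit = (audio_buffer_frames + 1) * audio_frame_time
    cond_a = abs(video_start - audio_start) < limit  # condition (a)
    cond_b = abs(video_end - audio_end) < limit      # condition (b)
    return cond_a and cond_b
```

With a buffer holding one frame, the limit equals the reproduction time of two audio frames, which corresponds to the two-frame bound used in the embodiment.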

[0776] Using a multimedia optical disk recorded with
a system stream containing plural scenario branches,
i.e., plural system streams branching from a single system
stream to which said plural system streams may
connect, where at least the same audio content is recorded
to one or more audio frames at the beginning of
each of the plural system streams connecting to said
single system stream, it is particularly possible to reproduce
plural scenario titles as single natural titles without
stopping the video presentation (video freezing) at the
system stream connection when connecting and contiguously
reproducing the system streams.
[0777] Using a multimedia optical disk recorded with
a system stream containing plural scenario connections,
i.e., plural system streams connecting to a single
system stream following thereafter, where at least the
same video content is recorded to one or more video
frames at the end of each of the plural system streams
connecting to said single system stream, or at the
beginning of the single system stream to which the
plural system streams connect, it is particularly
possible to reproduce plural scenario titles as
single natural titles without stopping the video
presentation (video freezing) at the system stream connection
when connecting and contiguously reproducing the
system streams.
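The two disk-level conditions, identical leading audio frames in streams that branch from a common stream and identical trailing video frames in streams that merge into a common stream, can be verified with a sketch such as the following. The stream representation (a dict holding lists of encoded frames as bytes) and both function names are assumptions for illustration only.

```python
def share_leading_audio(streams, n_frames=1):
    """True if all branch streams begin with the same n_frames audio frames."""
    heads = {tuple(s["audio_frames"][:n_frames]) for s in streams}
    return len(heads) <= 1


def share_trailing_video(streams, n_frames=1):
    """True if all merging streams end with the same n_frames video frames."""
    tails = {tuple(s["video_frames"][-n_frames:]) for s in streams}
    return len(tails) <= 1
```

Comparing encoded frames byte for byte is the strictest reading of "the same content"; an authoring tool could equally compare the source material before encoding.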

[0778] Video or audio buffer underflow states at sys- 
tem stream connections, i.e., intermittent video presen- 
tation (video freezing) or intermittent audio presentation 
(audio muting), resulting from the time difference in the 
video and audio reproduction times of different repro- 
duction paths can also be prevented by means of a DVD 
reproduction apparatus wherewith audio reproduction 
gap information is recorded to the reproduction control 
information, and said audio reproduction gap informa- 
tion is used by an audio decoder controller to start and 
stop audio decoder operation appropriately. 
[0779] By inserting a time difference in the video and 
audio reproduction times of different reproduction paths 
as an audio reproduction gap in one system stream not
shared by different program chains, problems created 
by system stream connections, i.e., across system 
streams, can be converted to a problem contained with- 
in a single system stream. It is therefore possible to con- 
tain the audio reproduction gap information within the 
DSI packet of the system stream, thus writing both the 
audio reproduction gap and the audio reproduction gap 
information to a single system stream, and thereby sim- 
plifying the data structure. 
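A simplified worked example shows why such a time difference arises when audio and video frames differ in length. The function `audio_gap` is hypothetical, and the frame durations used in the usage note (a video frame of 1001/30000 s and an audio frame of 32 ms) are illustrative values that are not taken from the text. The sketch treats the gap as the part of the video reproduction time that cannot be filled with whole audio frames, which is only one way such a gap can come about.

```python
from fractions import Fraction


def audio_gap(video_frames, video_frame_time, audio_frame_time):
    """Return (whole_audio_frames, gap) for a run of video frames.

    whole_audio_frames is the number of complete audio frames that fit in
    the video reproduction time; gap is the remaining time in seconds.
    Exact fractions are used to avoid floating-point rounding.
    """
    video_time = video_frames * video_frame_time
    whole_audio_frames = video_time // audio_frame_time
    gap = video_time - whole_audio_frames * audio_frame_time
    return whole_audio_frames, gap
```

Under these assumed durations, 30 video frames last 1.001 s, which holds 31 whole audio frames (0.992 s) and leaves a residual of 9 ms, i.e. `audio_gap(30, Fraction(1001, 30000), Fraction(32, 1000))` evaluates to `(31, Fraction(9, 1000))`.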

[0780] As a result, the present invention makes it sim- 
ple to reuse, i.e., share, system streams. 
[0781] Furthermore, because the audio reproduction
gap is contained within a single system stream, the audio
reproduction gap can be moved to any desirable position
in the system stream. As a result, it is possible to
move the audio reproduction gap to a silent or other
audibly innocuous location.
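One way to pick such a location is to place the gap next to the quietest audio frame. The text does not say how a silent position is chosen, so the energy criterion, the function name `quietest_frame_index`, and the representation of decoded audio frames as lists of sample values are all assumptions of this sketch.

```python
def quietest_frame_index(frames):
    """Return the index of the audio frame with the lowest signal energy.

    frames: non-empty list of frames, each a list of decoded sample values.
    When several frames tie, the earliest one is returned.
    """
    def energy(samples):
        return sum(s * s for s in samples)

    return min(range(len(frames)), key=lambda i: energy(frames[i]))
```

An authoring system could then write the presentation time of the chosen frame into the audio reproduction stopping time field, so that the gap falls where it is least audible.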

Industrial Applicability

[0782] As is apparent from the above, a method and an
apparatus according to the present invention for
interleaving a bitstream, recording the interleaved bitstream
to a recording medium, and reproducing the recorded
bitstream therefrom are suitable for use in an authoring
system which can generate a new title by editing a title
constructed from bitstreams carrying various information
in accordance with the user's request, and are also
suitable for the recently developed Digital Video Disk
System, or DVD System.

A list of further embodiments of the invention follows:

[0783] Embodiment 1 : An optical disc (M) for record- 
ing one or a plurality of system streams (VOB) contain- 
ing audio data and video data, wherein audio data and 
video data of a plurality of system streams (VOB) re- 
corded to the optical disc (M) are interleaved (multi- 
plexed) such that a difference (Tb1 - Tad1) of the input 
start times of video data and audio data to a buffer 
(2600) in a video decoder and a buffer (2800) in an audio 
decoder is less than the reproduction time of a number 
of audio frames (Af) that can be stored in the audio buffer 
plus one frame. 

[0784] Embodiment 2: An optical disc (M) for record- 
ing one or a plurality of system streams (VOB) contain- 
ing audio data and video data, 

wherein audio data and video data of a plurality of 
system streams (VOB) recorded to the optical disc (M) 
are interleaved (multiplexed) such that a difference be- 
tween the input start times of the video data and audio 
data to a buffer in the video decoder and a buffer in the 
audio decoder is less than the reproduction time of two 
audio frames (2 x Af).

[0785] Embodiment 3: The optical disc (M) with the
features of embodiment 1, wherein audio data and video
data of a plurality of system streams (VOB) recorded to
the optical disc (M) are interleaved such that a difference
(Tvae - Taae) of the input end times of video data and
audio data to a buffer (2600) in a video decoder and a
buffer (2800) in an audio decoder is less than the
reproduction time of a number of audio frames (Af) that
can be stored in the audio buffer plus one frame (Af).
[0786] Embodiment 4: The optical disc (M) with the
features of embodiment 3, wherein audio data and video
data of a plurality of system streams (VOB) recorded to
the optical disc (M) are interleaved such that a difference
of the input end times of video data and audio data to a
buffer (2600) in a video decoder and a buffer (2800) in
an audio decoder is less than the reproduction time of
two audio frames (2 x Af).
[0787] Embodiment 5: An optical disc (M) for recording
one or a plurality of system streams (VOB) containing
audio data and video data,

wherein when one or a plurality of system streams 
(VOB) is shared by a plurality of program chains 
(VTS_PGC), and there are at least two different system 
streams (VOB) reproduced immediately after a system 
stream (VOB) shared by a plurality of program chains 
(VTS_PGC) that are not the same in each of the plurality 
of system streams (VOB), 

a same audio content is recorded to at least a first 
audio frame (Af) in at least two system streams (VOB) 
reproduced immediately after a system stream (VOB) 
shared by a plurality of program chains (VTS_PGC). 
[0788] Embodiment 6: The optical disc (M) with the 
features of embodiment 5, wherein when one or a plu- 
rality of system streams (VOB) is shared by a plurality 
of program chains (VTS_PGC), and there are at least 
two different system streams (VOB) reproduced imme- 
diately after a system stream (VOB) shared by a plurality 
of program chains (VTS_PGC) that are not the same in 
each of the plurality of system streams (VOB), 

a same video content is recorded to at least a first 
video frame (Vf) in at least two system streams (VOB) 
reproduced immediately after a system stream (VOB) 
shared by a plurality of program chains (VTS_PGC). 
[0789] Embodiment 7: The optical disc (M) with the 
features of embodiment 5 and embodiment 6, wherein
when one or a plurality of system streams (VOB) is 
shared by a plurality of program chains (VTS_PGC), 
and there are at least two different system streams 
(VOB) reproduced immediately before a system stream 
(VOB) shared by a plurality of program chains 
(VTS_PGC) that are not the same in each of the plurality 
of system streams (VOB), 

a same video content is recorded to at least a last 
video frame (Vf) in at least two system streams (VOB) 
reproduced immediately before a system stream (VOB) 
shared by a plurality of program chains (VTS_PGC). 
[0790] Embodiment 8: The optical disc (M) with the 
features of embodiment 5, embodiment 6, or embodi- 
ment 7, wherein when one or a plurality of system 
streams (VOB) is shared by a plurality of program chains 
(VTS_PGC), and there are at least two different system 
streams (VOB) reproduced immediately before a sys- 
tem stream (VOB) shared by a plurality of program 
chains (VTS_PGC) that are not the same in each of the 
plurality of system streams (VOB), 

a same audio content is recorded to at least a last 
audio frame (Af) in at least two system streams (VOB) 
reproduced immediately before a system stream (VOB) 
shared by a plurality of program chains (VTS_PGC). 
[0791] Embodiment 9: The optical disc (M) with the
features of embodiment 5, embodiment 6, embodiment
7, or embodiment 8, wherein reproduction control
information (NV) is provided in a system stream (VOB), and
audio reproduction stop information
(VOB_A_STP_PTM1 and VOB_A_GAP_LEN) is written
to said reproduction control information.
[0792] Embodiment 10: A recording method for recording
one or a plurality of system streams (VOB) containing
audio data and video data to an optical disc (M),
whereby one or a plurality of system streams (VOB) are
recorded with the audio data and video data interleaved
such that a difference of the input start times of video
data and audio data to a buffer (2600) in a video decoder
and a buffer (2800) in an audio decoder is less than the
reproduction time of a number of audio frames (Af) that
can be stored in the audio buffer plus one frame.
[0793] Embodiment 11: The recording method with 
the features of embodiment 10, wherein one or a plurality
of system streams (VOB) are recorded with the audio
data and video data interleaved such that a difference 
of the input end times of video data and audio data to a 
buffer (2600) in a video decoder and a buffer (2800) in 
an audio decoder is less than the reproduction time of 
a number of audio frames (Af) that can be stored in the 
audio buffer (2800) plus one frame. 
[0794] Embodiment 12: A recording method for re- 
cording one or a plurality of system streams (VOB) con- 
taining audio data and video data to an optical disc (M), 
whereby when one or a plurality of system streams 
(VOB) is shared by a plurality of program chains 
(VTS_PGC), and there are at least two different system 
streams (VOB) reproduced immediately after a system 
stream (VOB) shared by a plurality of program chains 
(VTS_PGC) that are not the same in each of the plurality 
of system streams (VOB), 

a same audio content is recorded to at least a first 
audio frame (Af) in at least two system streams (VOB) 
reproduced immediately after a system stream (VOB) 
shared by a plurality of program chains (VTS_PGC). 
[0795] Embodiment 13: The recording method with 
the features of embodiment 12, wherein when one or a 
plurality of system streams (VOB) is shared by a plurality 
of program chains (VTS_PGC), and there are at least 
two different system streams (VOB) reproduced imme- 
diately after a system stream (VOB) shared by a plurality 
of program chains (VTS_PGC) that are not the same in 
each of the plurality of system streams (VOB), 

a same video content is recorded to at least a first 
video frame (Vf) in at least two system streams (VOB) 
reproduced immediately after a system stream (VOB) 
shared by a plurality of program chains (VTS_PGC). 
[0796] Embodiment 14: The recording method with
the features of embodiment 12 and embodiment 13,
wherein when one or a plurality of system streams
(VOB) is shared by a plurality of program chains
(VTS_PGC), and there are at least two different system
streams (VOB) reproduced immediately before a system
stream (VOB) shared by a plurality of program
chains (VTS_PGC) that are not the same in each of the
plurality of system streams (VOB),

a same video content is recorded to at least a last
video frame (Vf) in at least two system streams (VOB)
reproduced immediately before a system stream (VOB)
shared by a plurality of program chains (VTS_PGC).
[0797] Embodiment 15: The recording method with 
the features of embodiment 12, embodiment 13, or em- 
bodiment 14, wherein when one or a plurality of system 
streams (VOB) is shared by a plurality of program chains 
(VTS_PGC), and there are at least two different system 
streams (VOB) reproduced immediately before a sys- 
tem stream (VOB) shared by a plurality of program 
chains (VTS_PGC) that are not the same in each of the 
plurality of system streams (VOB), 

a same audio content is recorded to at least a last 
audio frame (Af) in at least two system streams (VOB) 
reproduced immediately before a system stream (VOB) 
shared by a plurality of program chains (VTS_PGC). 
[0798] Embodiment 16: The recording method with 
the features of embodiment 12, embodiment 13, em- 
bodiment 14, or embodiment 15, wherein reproduction 
control information (NV) is provided in a system stream 
(VOB), and audio reproduction stop information 
(VOB_A_STP_PTM1 and VOB_A_GAP_LEN) are writ- 
ten to said reproduction control information. 
[0799] Embodiment 17: An optical disc reproduction 
apparatus (DCD) comprising a data read means for 
reading reproduction control information from an optical 
disc (M) to which is recorded a system stream (VOB) 
containing reproduction control information declaring 
audio reproduction stop information 
(VOB_A_STP_PTM1 and VOB_A_GAP_LEN), and 

an audio reproduction stop control 
(VOB_A_STP_PTM1 and VOB_A_GAP_LEN) means 
for stopping audio reproduction based on the read re- 
production control information. 

[0800] Embodiment 18: A method for reproducing da- 
ta from an optical disc (M), whereby reproduction control 
information (NV) is read from an optical disc (M) to which 
is recorded a system stream (VOB) containing repro- 
duction control information declaring audio reproduction 
stop information (VOB_A_STP_PTM1 and 
VOB_A_GAP_LEN), and 

audio reproduction stop (VOB_A_STP_PTM1 and
VOB_A_GAP_LEN) control stops audio reproduction 
based on the read reproduction control information 
(NV). 



Claims 

1. A data storage medium storing at least a first and a
second system stream, said first and second system
streams each containing

a plurality of video packets, each video packet
carrying encoded video data of one video frame in
a compressed manner; and

a plurality of audio packets, each audio packet
carrying encoded audio data of one audio frame in
a compressed manner,

wherein said one audio frame is different
in time length from said one video frame,

said data storage medium further storing
audio gap information that indicates - in a case
where decoded video data decoded from the second
system stream are to be played seamlessly
continuing after the decoded video data decoded
from the first system stream - that an audio
discontinuous part is to be inserted in or at the end of said
decoded audio data decoded from the first system
stream before decoded audio data of the second
system stream are to be played after the decoded
audio data of the first system stream.

2. A data storage medium according to claim 1,
wherein said one audio frame is shorter in time
length than said one video frame.

3. A data storage medium according to claim 1 or 2,
wherein audio gap information is included in a
navigation pack of said first system stream and further
indicates the duration of said audio discontinuous
part appearing, during audio presentation, in
decoded audio data decoded from the first encoded
audio data.

4. A reproducing apparatus for reproducing a data
storage medium according to claim 1, said
reproducing apparatus comprising:

a reading arrangement which reads the video
packets and the audio packets from each of the
first and second system streams;
a video decoder which decodes the video packets
to produce decoded video data for video
presentation;
an audio decoder which decodes the audio
packets to produce decoded audio data for audio
presentation; and
a controller for controlling the reading arrangement,
the video decoder and the audio decoder
to play said second system stream after said
first system stream such that decoded video
data decoded from the second encoded video
data is played seamlessly continuing after the
play of decoded video data decoded from the
first encoded video data, whereas decoded audio
data decoded from the second encoded audio
data is played after the play of decoded audio
data decoded from the first encoded audio
data, wherein an audio discontinuous part appears
in or at the end of said decoded audio
data decoded from the first encoded audio data.

5. A reproducing method for reproducing a data storage
medium according to claim 1, said reproducing
method comprising:

reading the video packets and the audio packets
from each of the first and second system
streams;
decoding the video packets to produce decoded
video data for video presentation;
decoding the audio packets to produce decoded
audio data for audio presentation; and
controlling the reading arrangement, the video
decoder and the audio decoder to play said
second system stream after said first system
stream such that decoded video data decoded
from the second encoded video data is played
seamlessly continuing after the play of decoded
video data decoded from the first encoded video
data, whereas decoded audio data decoded
from the second encoded audio data is played
after the play of decoded audio data decoded
from the first encoded audio data, wherein an
audio discontinuous part appears in or at the
end of said decoded audio data decoded from
the first encoded audio data.
Fig. 11
Fig. 13